REMARKS

These remarks are in response to the Office Action mailed February 15, 2005. Claims 1 
and 18 have been amended. Claims 25-32 have been added. Claims 13-16 are canceled without 
prejudice. Claims 2-12, 19, and 21-22 were previously canceled. No claims are withdrawn. 
Thus, claims 1, 17-18, 20 and 25-32 are presently pending. Hence, Applicants submit that a new 
search is not necessary, and that all amendments are supported by the previously pending claims 
and originally filed specification. 
Also, SEQ ID NOs: 3 (accession no. AY040619), 4 (accession no. AY040620) and 5
(accession no. AY040622) were first seen on the NCBI database on August 23, 2002. The 
deposit satisfies the Budapest Treaty on the International Recognition of the Deposit of 
Microorganisms for the Purposes of Patent Procedure (1977). Applicants amended the 
specification (paragraphs [0026] to [0028]) in the response to the Office Action mailed 
September 15, 2003 (page 2-3 and Exhibit A of the response) to reflect the above. 
I. Amendments to the Specification

Paragraph [0026] has been amended to particularly point out the subject matter of the
claimed invention, namely 16S rRNA sequences from a strain of Salinospora containing
certain signature nucleotides. The 16S rRNA sequences are patentable irrespective of how the
signature sequences are identified. Applicants note that, in order to identify and characterize a
microbial organism, it is standard in the art to perform a computational alignment analysis
against other known 16S rRNA gene sequences, including but not limited to the E.coli 16S
rRNA sequence. The E.coli 16S rRNA sequence is, however, the reference sequence with which
all other 16S rRNA sequences are compared. This aspect will be made clearer in the
following response and in the expert Declaration under 37 CFR § 1.132 of Dr. Stephen
Giovannoni (see Appendix and Exhibits A, B and C). Therefore, paragraph [0026] has been
amended to particularly reference E.coli as the standard for the 16S rRNA gene sequences.

Table 3 of the specification (page 12) has been amended to include another column
indicating the actual nucleotide positions of the signature nucleotides of the claimed Salinospora
actinomycete isolates. Applicants have amended Table 3 to eliminate the confusion caused by the
phrase "E.coli numbering system" with respect to the claimed subject matter. The signature
sequences of the Salinospora isolates of the instant invention are not dependent on the E.coli 16S
rRNA sequence; the E.coli sequence is used only as a relative marker with respect to the
Salinospora sequences. That is, the signature sequences of the instant invention are "signature"
because they do not exist in Micromonosporaceae 16S rRNA sequences, and not because they
are identified based on the E.coli 16S rRNA sequence. Thus, the amendment to Table 3 does not
add new matter.
II. Amendments to the Claims

Claims 1 and 18 have been amended to improve their form. The claimed invention
relates to 16S rRNA signature nucleotides of at least 3 Salinospora isolates, e.g. SEQ ID
NOs: 3, 4 and 5.

Claim 1 has been amended and now recites, "[a]n isolated marine actinomycete having an
obligate requirement of sodium for growth, wherein the marine actinomycete is a strain of
Salinospora comprising 16S rRNAs SEQ ID NO:3, 4 or 5." Amended claim 1 contains
patentable subject matter, irrespective of the E.coli 16S rRNA sequence numbering system,
because, as discussed above, the signature sequences are "signature" because they are not found
in Micromonosporaceae genera, and not because of their numbering system.

The amendments to claims 1 and 18 are fully supported by the specification, e.g. 
paragraphs [0026] and [0027]. Paragraphs [0026] and [0027] were first amended in the response 
mailed December 15, 2003 to the Office Action mailed September 15, 2003, to include the 
sequence identifiers (SEQ ID NOs: 3, 4 and 5) of the described isolates (accession numbers: 
AY040619, AY040620 and AY040622, respectively). The accession numbers were disclosed in 
the application as filed. Hence, it is clear that Applicants were in possession of the nucleotide 
sequences at the time the application was filed. 

Claims 13-16 have been canceled because they have been incorporated into independent 
claim 1. Claims 25-30 have been added. 

Claims 25-30 do not add new subject matter because they simply claim that which was 
previously presented in pending claim 1. 

Accordingly, Applicants submit that a new search is not necessary, and that all 
amendments to the claims are supported by the application as filed and by the previously 
pending claims. 
III. Rejection Under 35 U.S.C. §112

A. 35 U.S.C. §112, first paragraph (written description) 

Claims 1, 13, 17-18 and 20 stand rejected under 35 U.S.C. §112, first paragraph as
allegedly failing to comply with the written description requirement. Applicants respectfully
traverse this rejection.



Claims 1 and 18 have been amended and claims 25-30 have been added. Claims 1 and 
18 have been amended per the suggestion of the Office Action on page 4. 

Claim 13 has been canceled, making the rejection with regard to claim 13 moot.



Claim 1 has been amended to incorporate the limitation that the isolates contain SEQ 
ID NOs: 3, 4 or 5. Claim 1 has also been amended to delete the recitation relating to 
identification of the signature sequences based on the E.coli 16S rRNA numbering system in 
the Ribosomal Database Project (RDP). Attached herein is a Declaration under 37 CFR § 
1.132 by Dr. Stephen Giovannoni (see Appendix and Exhibits A-C). 

Dr. Giovannoni is an expert in the area of microbial identification and characterization. 
Attention is drawn to paragraphs 3, 4 and 5 of the Declaration of Dr. Giovannoni, whereby Dr. 
Giovannoni states: 

3. First, as "one skilled in the art," I declare that the universal
"gold standard" for microbial classification is a computational alignment 
analysis against existing 16S rRNA gene sequences. This same "gold 
standard" was utilized in the above-identified patent application to identify 
and characterize the signature sequences in the Salinospora isolates. 

4. Second, Bergey's Manual of Systematic Bacteriology (George M. 
Garrity, David R. Boone, Editors) is a text used worldwide, and supports the 
above mentioned "gold standard" (i.e. microbial identification and 
classification schemes rely on 16S rRNA sequences). Bergey's Manual of 
Systematic Bacteriology, is also the taxonomic hierarchy used in the 
Ribosomal Database Project (RDP). The RDP provides aligned and 
annotated rRNA gene sequences, and is made available through the RDP
website (http://rdp.cme.msu.edu/). See also Cole et al., "The ribosomal
database project (RDP-II): sequences and tools for high-throughput rRNA 
analysis," Nucleic Acids Research, 33:D294-296 (2005), which is hereby 
attached as Exhibit B. The signature nucleotides of the Salinospora 16S 
rRNA sequences claimed in the above identified application were aligned 
against other 16S rRNA gene sequences, including members of 
Micromonosporaceae in the RDP (see paragraph [0028] and Table 3 of the 
specification). 

5. Third, Brosius et al. (1978) was first to publish the complete 16S
rRNA of E.coli (please see Brosius et al., "Complete nucleotide sequence of
16S ribosomal RNA gene from Escherichia coli," Proc. Natl. Acad. Sci.
USA 75(10):4801-4805, which is hereby attached as Exhibit C). The Brosius
16S rRNA E.coli gene sequence is the reference sequence used worldwide
and is the sequence found on the RDP. This is true regardless of any
changes and updates to the sequence. To put it another way, the E.coli 16S
rRNA gene sequence on the RDP is "permanent and invariant." Therefore,
the Applicants' mention of "E.coli positions 27-1492" is a reference to the
Brosius' E.coli 16S rRNA gene sequence. Those skilled in the art will
immediately understand that the Applicants were in possession of the
claimed subject matter based on a comparison of the invention Salinospora
16S rRNA sequences with E.coli 16S rRNA sequence.

Thus, as demonstrated, in part, by the Declaration of Dr. Giovannoni above: computational
alignments using small rRNA gene sequences are universally accepted and are the "gold
standard"; Bergey's Manual of Systematic Bacteriology supports the "gold standard" use
of 16S rRNA in computational alignment analysis, and this taxonomic hierarchy is used in
the RDP; the Brosius et al. (1978) E.coli 16S rRNA gene sequence is the same E.coli 16S
rRNA sequence used worldwide today, regardless of any changes that may be found in the
E.coli sequence after that date; and the Brosius et al. E.coli 16S rRNA sequence is the
same sequence referenced in Applicants' application and found in the RDP.
Additionally, Carl Woese, an eminent scholar in the area of microbial diversity and
winner of the National Medal of Science (2000), states that:

rRNAs are at present the most useful and most used of the molecular 
chronometers. They show a high degree of functional constancy, which 
assures relatively good clocklike behavior. They occur in all organisms, and 
different positions in their sequences change at very different rates, allowing 
most phylogenetic relationships (including the most distant) to be measured, 
which makes their range all-encompassing. Carl R. Woese, "Bacterial 
Evolution," Microbiological Reviews, pp. 221-271 (1987); see Attachment 1. 

The statement that "different positions in their sequences change at very different rates" is a
reference to signature nucleotides found within particular groups or clades of organisms.
Stackebrandt et al. (1997) support this same notion, stating that "[s]ignature nucleotides are
derivatives of the classification process; i.e. signatures are determined for those organisms that
are contained within a particular data set." Stackebrandt et al., "Proposal for a new hierarchic
classification system, Actinobacteria classis nov.," Int. J. Systematic Bacteriology, 47(2):479-491
(1997); see Attachment 2. For example, Stackebrandt et al. show that the family
Micromonosporaceae has a pattern of 16S rDNA signatures consisting of nucleotides at
positions 66-103 (G-C), 127-234 (A-U), 153-168 (C-G), 502-543 (G-C), 589-650 (C-G), etc.
(page 487, left hand column, first paragraph). Further to this, Stackebrandt et al., supra,
describe phylogenetic studies based on alignments obtained from the RDP (see page 480, left
hand column, under Materials and Methods).

Hence, rRNAs have been used in the art for many years as molecular chronometers and to
determine phylogeny. It is standard in the art to define, identify or characterize new
microbial organisms based on signature sequences as determined from a computational
alignment analysis using 16S rRNA sequences. It is standard in the art to use the E.coli 16S
rRNA gene sequence as the reference sequence for identifying the nucleotide positions of the
signature sequences (see page 480, right hand column, first paragraph, of Stackebrandt et al.).
Lastly, it is standard in the art to use the Brosius et al. (1978) E.coli 16S rRNA gene sequence
to determine phylogeny with respect to microorganisms (see page 480, right hand column, first
paragraph).

New claims 25-30 also meet the written description requirement because they claim
the same subject matter as that previously presented in claim 1.

Therefore, Applicants submit that, based on the discussion above, Dr. Giovannoni's
Declaration under 37 CFR § 1.132, and the Woese and Stackebrandt et al. references
(Attachments 1 and 2), Applicants provided a complete disclosure of sufficiently detailed,
relevant identifying characteristics to show the Applicants were in possession of the claimed
invention.

Accordingly, withdrawal of the rejection of claims 1, 13, 17-18 and 20 under 35
U.S.C. §112, first paragraph is respectfully requested.



B. 35 U.S.C. §112, second paragraph (indefiniteness) 

Claims 1, 13-18 and 20 stand rejected under 35 U.S.C. §112, second paragraph as 
allegedly being indefinite for failing to particularly point out and distinctly claim the subject 
matter which Applicant regards as the invention. Applicants respectfully traverse this rejection. 

Claims 1 and 18 have been amended and claims 25-30 have been added. Claims 13-16
have been canceled. Claims 1 and 18 have been amended per the suggestion of the Office
Action on page 4 (i.e. for claim 1 to include the limitations of claims 13-16, and for claim 18 to
include the limitation of "sodium containing" growth medium). As described above, the
amendments to the claims and to paragraph [0026] and Table 3 of the specification are made to
improve their form and clarity without adding new subject matter. The amendments to the
claims and specification are supported by the specification as filed and by the previously pending
claims. Therefore, the metes and bounds of the claimed invention are clearly set forth.

Accordingly, withdrawal of the rejection of claims 1, 13-18 and 20 under 35 U.S.C. 
§112, second paragraph is respectfully requested. 






In re Application of: 
Fenical et al. 



PATENT 

Attorney Docket No.: UCSD1630-1 



Application No.: 09/991,518
Filed: November 16, 2001

IV. Conclusion 

Applicants submit that the pending claims are in condition for allowance. Reexamination,
reconsideration, withdrawal of the rejections, and early indication of allowance are respectfully
requested. If any questions remain, the Examiner is urged to contact the undersigned below.

No fee is believed due in connection with this Amendment. If any additional fees are due, the
Commissioner is hereby authorized to charge any fees that may be required by this paper to Deposit
Account No. 07-1896. A duplicate copy of this Transmittal Sheet is attached.



Respectfully submitted, 



Date: May 20, 2005




Lisa A. Haile, J.D., Ph.D.
Registration No. 38,347 
Telephone: (858) 677-1456 
Facsimile: (858) 677-1465 



DLA PIPER RUDNICK GRAY CARY US LLP 



4365 Executive Drive, Suite 1100
San Diego, California 92121-2133 
USPTO CUSTOMER NO. 28213 
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE



Applicants: Fenical et al.          Art Unit: 1651

Application No.: 09/991,518         Examiner: I. Marx

Filing Date: November 16, 2001      Confirmation No.: 7755

Title: MARINE ACTINOMYCETE TAXON FOR DRUG AND
FERMENTATION PRODUCT DISCOVERY



Commissioner for Patents 
P.O. Box 1450 
Alexandria, VA 22313



Sir: 



DECLARATION OF
APPLICANT UNDER 37 C.F.R. § 1.132



I, Stephen J. Giovannoni, do hereby declare and state that: 



1. I am a Professor at Oregon State University in Corvallis, Oregon. I conduct
research and have published many articles relating to microbial identification and classification,
including those based on 16S rRNA alignment and analysis. Currently, I am teaching a course
titled "Microbial Diversity," which relates directly to my area of expertise and interest, as well
as to the subject matter of the above-identified patent application. My curriculum vitae is
attached as Exhibit A.

2. I am on the Scientific Advisory Board of Nereus Pharmaceuticals, located at
10480 Wateridge Circle, San Diego, California 92121. Nereus Pharmaceuticals is the exclusive
licensee of the above-identified matter.

3. I understand that claims 1, 13, 17-18 and 20 have been rejected under 35
U.S.C. § 112, first paragraph, in the above-identified patent application, for allegedly failing to
comply with the written description requirement. I understand that the alleged failure to satisfy
the written description requirement means that the claimed subject matter was allegedly not
described in the application in such a way as to reasonably convey to one skilled in the relevant
art, such as myself, that the inventor(s), at the time the application was filed, had possession of
the claimed invention.

4. First, as "one skilled in the art," I declare that the universal "gold standard"
for microbial classification is a computational alignment analysis against existing 16S rRNA
gene sequences. This same "gold standard" was utilized in the above-identified patent
application to identify and characterize the signature sequences in the Salinospora isolates.



5. Second, Bergey's Manual of Systematic Bacteriology (George M. Garrity, David
R. Boone, Editors) is a text used worldwide, and supports the above mentioned "gold standard"
(i.e. microbial identification and classification schemes rely on 16S rRNA sequences). Bergey's
Manual of Systematic Bacteriology, is also the taxonomic hierarchy used in the Ribosomal
Database Project (RDP). The RDP provides aligned and annotated rRNA gene sequences, and is
made available through the RDP website (http://rdp.cme.msu.edu/). See also Cole et al., "The
ribosomal database project (RDP-II): sequences and tools for high-throughput rRNA analysis,"
Nucleic Acids Research, 33:D294-296 (2005), which is hereby attached as Exhibit B. The
signature nucleotides of the Salinospora 16S rRNA sequences claimed in the above identified
application were aligned against other 16S rRNA gene sequences, including members of
Micromonosporaceae in the RDP (see paragraph [0028] and Table 3 of the specification).

6. Third, Brosius et al. (1978) was first to publish the complete 16S rRNA of E.coli
(please see Brosius et al., "Complete nucleotide sequence of 16S ribosomal RNA gene from
Escherichia coli," Proc. Natl. Acad. Sci. USA 75(10):4801-4805, which is hereby attached as
Exhibit C). The Brosius 16S rRNA E.coli gene sequence is the reference sequence used
worldwide and is the sequence found on the RDP. This is true regardless of any changes and
updates to the sequence. To put it another way, the E.coli 16S rRNA gene sequence on the RDP
is "permanent and invariant." Therefore, the Applicants' mention of "E.coli positions 27-1492" is
a reference to the Brosius' E.coli 16S rRNA gene sequence. Those skilled in the art will
immediately understand that the Applicants were in possession of the claimed subject matter
based on a comparison of the invention Salinospora 16S rRNA sequences with E.coli 16S rRNA
sequence.
7. I further declare that all statements made herein of knowledge are true and that all
statements made on information and belief are believed to be true, and further that these
statements were made with the knowledge that willful false statements and the like so made are
punishable by fine, or imprisonment, or both under Section 1001 of Title 18 of the United States
Code, and that such willful false statements may jeopardize the validity of the application or any
patent issued thereon.
CURRICULUM VITAE
Stephen J. Giovannoni

Department of Microbiology
Oregon State University
Corvallis, OR 97331-3804 U.S.A.
Telephone: (541) 737-1835
Fax: (541) 737-0496
E-mail: steve.giovannoni@oregonstate.edu

Education:

• University of Oregon; Ph.D. in Biology 1984

• Boston University; M.A. in Biology 1978

• University of California, San Diego; B.A. in Biology 1974

Research Interests: 

• Oceanic and Freshwater Bacterioplankton 

• Microbial Genomics 

• Microbial Life in the Oceanic Lithosphere 

Professional Experience: 

• Director, Molecular and Cellular Biology Program,
Oregon State University, Corvallis. 2000-present

• Professor, Department of Microbiology,
Oregon State University, Corvallis. 1999-present

• Associate Professor, Department of Microbiology,
Oregon State University, Corvallis. 1993-1999

• Assistant Professor, Department of Microbiology,
Oregon State University, Corvallis. 1988-1993

• NSF Postdoctoral Research Fellow with Norman Pace,
Indiana University, Bloomington. 1984-1988

• Instructor, Department of Biology, University of Oregon, Eugene. 1984

• Graduate Research Assistant with Richard Castenholz,
Department of Biology, University of Oregon, Eugene. 1979-1984

• Research Associate with Edward Leadbetter, Biological Sciences
Department, University of Connecticut. 1978-1979

• Graduate Teaching Fellow with Lynn Margulis, Department of
Biology, Boston University, Boston. 1975-1978

• Research Associate with George Feher, Department of Physics,
University of California, San Diego. 1974-1975



Honors and Awards: 

• Pernot Endowed Professor, OSU Department of Microbiology 2005 

• Milton Harris Award for Exceptional Achievement in Microbiology 

College of Science, Oregon State University 2003 

• Fellow, American Academy of Microbiology 1997 

• Sugihara Young Faculty Research Award, 

College of Science, Oregon State University 1994 

• Emerging Scholar Award, Phi Kappa Phi 1993 

• Morgenroth Award for Exceptional Achievement as a Graduate Student 
University of Oregon 1984 



Recent Teaching Activity at Oregon State University: 






• Full responsibility for Genomics and Cellular Evolution (MB668), yearly. 

• Full responsibility for Microbial Diversity (MB420/520), alternate years. 

• Full responsibility for Microbial Diversity Laboratory (MB421/521), alternate years 1989 

• Fall, '03, MCB511 Research Perspectives (1.5 lecture hours) 

• Fall, '03, MCB526 Advanced Biotechnology Techniques (3 lecture hours)

• Fall, '03, OC669 Principles of the Subsurface Biosphere (2 lecture hours) 



Graduate Training: 

• Major or co-advisor for five completed Ph.D.s and six masters degrees. 

• Service on over 30 graduate committees. 

Teaching (other institutions): 

• Co-instructor in Marine Genomics, The Bermuda Biological Station for Research, July 2003, 2004. 

• Lecturer in Microbial Diversity, The Rockefeller University, Jan., 1998 and Jan. 2000. 

• Co-instructor in Marine Microbial Ecology, The Bermuda Biological Station for Research, July
1999-2002.

• Lecturer in Microbial Phylogeny: Linkages to Processes and Biogeochemistry (Microbiology 670/470),
University of Tennessee, Feb. 1998.

• Instructor, Module Organizer, University of Southern California/ ONR. Advanced Techniques 
Course Molecular Biology and Biochemistry of Marine Organisms, July 1992. 

• Instructor, Marine Biological Laboratory (Woods Hole, MA) course Molecular Probes in Marine 
Ecology, summer, 1989. 

Public Outreach: 

• Advisor for American Society for Microbiology/Public Broadcasting
Production: "Intimate Strangers, Unseen Life on Earth" 

• Member of Microbial Literacy Collaborative, an American Society 
for Microbiology organization dedicated to disseminating knowledge 
about microbiology to the general public 

Invited talks (recent): 

• Opening Lecture, Tenth International Symposium on Microbial Ecology, Cancun, Mexico, Aug., '04, 
Winogradsky Revisited. 

• Invited Lecture, Ecological Society of America Annual Meeting, Portland, OR. Aug. '04, Temporal and 
spatial structure in bacterioplankton populations in the western Sargasso Sea.

• Invited Lecture, Census of Marine Life Workshop, Monterey, CA, Nov. '03. High Throughput Methods 
for the Cultivation of Marine Bacteria. 

• Invited Lecture, ASM Annual Meeting, Washington D.C., May, '03, The Uncultured Microbial Majority. 

• Invited Lecture, ASM Annual Meeting, Washington D.C., May, '03, A Targeted High-throughput 
cultivation strategy. 

University Service: 

• Director, Molecular and Cellular Biology Program, 2001-2004 
University Committees: 

• Chair of search committee for director of the Center for Gene Research and Biotechnology, 1999-2000. 

• Center for Gene Research and Biotechnology Advisory Board member and representative to Research 
Office, 1997 to present. 

• Vice Provost for Research Search Committee member, 1996-1997. 

• Rice Endowed Chair in Entomology Search and Oversight Committee member, 1997 to present. 



1997-1998 
1997 
• Chair, OSU Research Council 1994-1995.

• OSU Research Council member, 1993-1994. 



Sponsored Seminars and Symposia: 

• Co-organizer of Center for Gene Research and Biotechnology Annual Retreat, Sept. 20-21, 1996. 

• Organized annual banquet (Chair of Banquet Committee), Phi Kappa Phi., 1997. 

• Organized Sugihara symposium Microbial Diversity at Oregon State University, Feb. 1996. 

• Initiated and organized departmental seminar series, including visit and seminar by Fran Paerler of 
New England Biolabs, 1996-1997.

• Organized Department of Microbiology Summer Research Symposium, 1996. 

Professional Societies: 

• American Society for Microbiology 

• American Association for the Advancement of Science 

• American Society of Limnology and Oceanography 



Professional Service: 

• Associate Editor, Environmental Microbiology 2000-present 

• Editorial Board of Applied and Environmental Microbiology 1997-1999 

• Grant Panel member, NSF Ecology division 1998 

• Member of Ocean Drilling Program Deep Biosphere Planning Group 1997-1998 

• Chair for the Division of Systematic and Evolutionary Biology, 

American Society for Microbiology 1990 

• Chair Elect for the Division of Systematic and Evolutionary 

Biology of the American Society for Microbiology 1989 

Grant Panelist (Recent): 

• Grant Review Panel: NSF/USDA Small Genomes, March, '03

• Grant Review Panel: NOAA Ocean Sciences, Dec. '03. 

Ad-hoc Manuscript Reviews: Nature, Marine Ecology Progress Series, Science, International Journal of 
Systematic Bacteriology, Limnology and Oceanography, Applied and Environmental Microbiology, 
Proceedings of the National Academy of Sciences U.S.A.

Ad-hoc Proposal Reviews: FASEB, NSF, U. S. Environmental Protection Agency, U. S. Dept. of 
Agriculture, Australian Research Council. 

Patents: U.S. No. 10/053,243. High Throughput Microbial Culturing. Pending 
Research Grants: 

1986-1988 NSF Postdoctoral Fellowship Award, Division of Biotic Systems and Resources:
"Phylogenetic Analysis of Marine Picoplankton by rRNA Gene Cloning and Sequencing"

1988 OSU Research Council Grant: "Chloroplast Phylogeny by 16S Ribosomal RNA Gene
Sequence Analysis," $4,000.

1988-1989 Oregon Medical Research Foundation Grant: "Molecular Phylogeny of Two Protozoan
Pathogens: Pneumocystis carinii and Leishmania sp.," $10,953.

1989-1990 National Science Foundation Grant: "In situ Analyses of the Distributions and Phylogeny
of Cultivatable and Non-cultivatable Planctomycetales Using Phylogenetic Group-Specific
RNA Probes," BSR-8818167, $110,000.

1989-1995 National Dairy Promotion and Research Board Grant: "Probes for Conserved 16S
Ribosomal RNA (rRNA) Gene Sequences to Isolate Lactococcus cremoris from Nature,"
$210,000.

1991 National Science Foundation Grant: "In situ Analyses of the Distributions and Phylogeny
of Cultivatable and Non-cultivatable Planctomycetales Using Phylogenetic Group-Specific
rRNA Probes," BSR-9020477, $75,000.

1991 OSU Research Council Grant: "In Situ Microscopic Quantification of Low-copy Number
Ribosomal RNA Targets by SIT Camera Image Analysis," $4,000. 

1991-1993 National Science Foundation Grant: "Molecular Analyses of the Population Dynamics and 
Activity of a Newly Identified Bacterioplankton Group," OCE-9016373, $368,460. 

1992 Oregon Advanced Computing Institute Grant: "gRNAid: An Interactive Graphics Program 
for Predicting the Secondary Structures of Ribonucleic Acid Molecules," 40-0140, $19,315. 

1993-1997 Department of Energy, Ocean Margins Program Grant: "The Dynamics of Carbon 

Exchange in Vertically Stratified Coastal Bacterioplankton Communities." FG0693ER61697, 
$386,189. 

1994 Joint Oceanographic Institutions Grant: "Genetic Evidence for Endolithic Microbial Life 

colonizing Basaltic glass/ sea water interfaces"; Co-P.I. with M. Fisk. $12,000. 

1995-1998 National Science Foundation Grant: "Antarctic Lake Ice Microbial Consortia: Origin, 
Distribution and Growth Physiology", PP-9419423, $108,327 (to S.J.G.). 

1995-1996 OSU Research Council Grant: "The Kinetics of Gene Amplification and Chimera Formation
in the Polymerase Chain Reaction", $7,973. 

1997-1999 National Science Foundation Grant: "Evidence for Endolithic Microbes in Oceanic Basalts", 
Co-P.I. with M. Fisk, OCE-9618728, $38,350 (to S.J.G.). 

1997-2000 National Science Foundation Grant: "Interactions Between Bacterioplankton Communities
and Dissolved Substrates at the Bermuda Atlantic Time Series Study Station", OCE-9618530,
with C. Carlson, co-P.I., $293,000 (to S.J.G.).

1997-2000 National Science Foundation Grant: "Spatial, Temporal and Phylogenetic Structure of
Bacterioplankton Communities in Crater Lake, Oregon", DEB-9709012, with E. Urbach,
co-P.I., $303,588.

1998-1999 National Science Foundation Grant: "Development of Capability to Measure Proxies of
Microbial Activity Within Ocean Crust", BES-9729672, with James Cowen (P.I.), F. Kenig
and H.P. Johnson. $54,881 (to S.J.G.).

1998-2001 National Science Foundation Grant: "Time-series Responses to a Mid-Ocean Ridge
Volcanic Event: Juan de Fuca and Gorda Ridges", OCE-9902048, with James Cowen (P.I.).
$94,298 (to S.J.G.).

1998-1999 National Science Foundation Grant: Quantitative Imaging of the Smallest Bacterioplankton
Cells, SAR11, at the Theoretical Limits of Light Microscopy. OCE-9816489. $19,500.

1999-2001 Collaborative Research on Bacterioplankton Biology and Biochemistry at the Bermuda
Atlantic Time-series Station: An Oceanic Microbial Observatory. MCB-9977930. $299,990
(to S.J.G.).

1999-2001 National Science Foundation Major Research Instrumentation Grant: Advanced Microbe 
Isolation Laboratory. OIA-9977469. $338,940. 

1999-2001 Murdock Charitable Trust Grant: Microbe Discovery by Solid State Cytometry with
Fluorescent DNA Probes. $306,730.

2000-2002 Oregon Sea Grant: Are Algicidal Bacteria Important in Controlling Phytoplankton
Blooms in Oregon Coastal Waters? R/HAB-01. $236,564.

2000-2003 National Science Foundation Grant: Effects of Microbial Activity on Rates of Basalt
Alteration. With co-P.I. M. Fisk. OCE-0085436. $358,999.

2001 Diversa Corporation Contract: High Throughput Culturing. $225,000.

2001-2006 National Science Foundation Proposal. IGERT- The Earth's Subsurface Biosphere. Co-P.I.
with M. Fisk. $2,674,860.

2001-2004 Sloan Foundation Grant: Professional Masters Degree Programs in the Sciences at Oregon
State University. $400,924.

2002-2005 National Science Foundation Grant: Coastal Bacterioplankton Systematics: A High
Throughput Culturing Approach. DEB 0207085. $275,999.

2003-2008 National Science Foundation Grant: Collaborative Research Linking Microbial Discovery
to Biogeochemical Processes: An Oligotrophic Oceanic Microbial Observatory. MCB-0237713.
$306,102.

2003-2004 National Science Foundation Grant: Collaborative Research: Microbial Diversity and
Function in the Permanently Ice-covered Lakes of the McMurdo Dry Valleys, Antarctica.
$99,989.

2005-2010 Gordon and Betty Moore Foundation: Genomics of Oceanic Bacteria. $3,640,000. 



Peer Reviewed Articles (published, in press, submitted, or near submission, in reverse order): 

83. Giovannoni, S.J., L. Bibbs, S. Givan, J. Tripp, M. Podar, M. Noordeweir, J. Eads, E. Mathur, M. S.
Rappé, and K. L. Vergin. Genome Streamlining in a Cosmopolitan Oceanic Bacterium. In Preparation.

82. Cho, J. C., M. D. Stapels, R. M. Morris, K. L. Vergin, D. F. Barofsky, and S. J. Giovannoni. Proteomics
links novel aerobic anoxygenic phototroph isolates to the oceanic metagenome. In review. 

81. Giovannoni, S.J., D. Barofsky, L. Bibbs, J.C. Cho, R. Desiderio, S. Laney, E. Mathur, M. S. Rappé, M.
Staples, and K. L. Vergin. Proteorhodopsin phototrophy in the Ubiquitous Marine Bacterium 
Pelagibacter ubique. In review. 

80. Morris, R.M., J.C. Cho, M.S. Rappé, K. L. Vergin, C. A. Carlson, and S. J. Giovannoni.
Bacterioplankton responses to deep season mixing in the Sargasso Sea. Limnol. Oceanog. In review. 



79. Page, K. A., S. A. Connon, and S. J. Giovannoni. 2004. Oligotrophy isolates from Crater Lake, Oregon
are representative of dominant freshwater bacterioplankton. 70:6542-6550. 

78. Staples, M.D., J. C. Cho, S. J. Giovannoni and D.F. Barofsky. 2004. Proteomic analysis of novel
marine bacteria using MALDI and ESI spectrometry. J. Biomolec. Techniques. 15:191-198. 

77. Connon, S. A., A. Tovanabootr, M. Dolan, K. Vergin, S.J. Giovannoni, and L. Semprini. 2004.
Bacterial Community Composition Determined by Culture Independent and Dependent Methods during
Propane Stimulated Bioremediation in Trichloroethene Contaminated Groundwater. Environ. Microbiol. 
7:165-78. 

76. Carlson, C.A., S. J. Giovannoni, D. A. Hansell, S. J. Goldberg, R. Parsons, K. Vergin. 2004. 
Interactions between DOC, microbial processes, and community structure in the mesopelagic zone of the 
northwestern Sargasso Sea. Limnol. Oceanog. 49:1073-1083. 

75. Morris, R. M., M. S. Rappé, E. Urbach, S. A. Connon, S. J. Giovannoni. 2004. Prevalence of the
Chloroflexi-Related SAR202 Bacterioplankton Cluster Throughout the Mesopelagic Zone and Deep 
Ocean. Environ. Microbiol. 70:2836-2842. 

74. Cho, J. C., and S. J. Giovannoni. 2004. Robiginitalea biformata gen. nov., sp. nov., a new marine
bacterium in the family Flavobacteriaceae that contains higher G+C composition. Int. J. Syst. Evol. 
Microbiol. 54:1101-1106. 

73. Cho, J. C., and S. J. Giovannoni. 2004. Oceanicola granulosus gen. nov., sp. nov. and Oceanicola
batsensis sp. nov., poly-beta-hydroxybutyrate-producing marine bacteria in the order "Rhodobacterales". 
Int. J. Syst. Evol. Microbiol. 54:1129-1136. 

72. Cho, J. C., K. L. Vergin, R. M. Morris, and S. J. Giovannoni. 2004. Discovery of the novel bacterial
phylum Lentisphaerae with cultivation of Lentisphaera araneosa gen. nov., sp. nov., a transparent
exopolymer producing marine bacterium. Environ. Microbiol. 6:611-621. 

71. Cho, J. C., and S. J. Giovannoni. 2004. Cultivation and growth characteristics of a diverse group of
oligotrophic marine Gammaproteobacteria. Appl. Environ. Microbiol. 70:432-440. 

70. Cho, J.C., and S. J. Giovannoni. 2003. Fulvimarina pelagi gen. nov., sp. nov., a marine bacterium that 
forms a deep evolutionary lineage of descent in the order "Rhizobiales". Int. J. Syst. Evol. Microbiol. 
53:1853-1859. 

69. Cho, J.-C., and S. J. Giovannoni. 2003. Parvularcula bermudensis gen. nov., sp. nov., a marine
bacterium that forms a deep branch in the α-Proteobacteria. Int. J. Syst. Evol. Microbiol. 53:1031-1036.

68. Cho, J.-C., and S. J. Giovannoni. 2003. Croceibacter atlanticus gen. nov., sp. nov., a novel marine
bacterium in the family Flavobacteriaceae. Syst. Appl. Microbiol. 26:76-83. 

67. Cowen J., S.J. Giovannoni, H. P. Johnson, F. Kenig, D. Butterfield, M. Rappé, Hutnak, and P. Lam.
2003. Fluids from aging ocean crust that support microbial life. Science 299:120-123. 

66. Morris, R.M., M. S. Rappé, S. A. Connon, K. L. Vergin, W. A. Siebold, C. A. Carlson, S. J. Giovannoni.
2002. High Cellular Abundance of the SAR11 Bacterioplankton Clade in Seawater. Nature 420: 806-810. 

65. Rappé, M.S., S. Connon, K. L. Vergin, and S. J. Giovannoni. 2002. Cultivation of the ubiquitous
SAR11 marine bacterioplankton clade. Nature 418:630-631.

64. Carlson, C. A., S. J. Giovannoni, D. A. Hansell, S. J. Goldberg, R. Parsons, M. P. Otero, K. Vergin,
and B. R. Wheeler. 2002. The effect of nutrient amendments on bacterioplankton production, community 
structure, and DOC utilization in the northwestern Sargasso Sea. Aquatic Microbial Ecology. 30:19-36. 

63. Connon, S. and S. J. Giovannoni. 2002. High throughput methods for culturing microorganisms in
very low nutrient media yield diverse new marine isolates. Appl. Environ. Microbiol. 68: 3878-3885. 

62. Vergin, K. L., M.S. Rappé, and S.J. Giovannoni. 2001. Streamlined method to analyze 16S rRNA
gene clone libraries. Biotechniques 30:938-944. 

61. Urbach, E., K. L. Vergin, L. Young, A. Morse, G. Larson and S.J. Giovannoni. 2001. Unusual
bacterioplankton in Crater Lake Oregon. Limnol. Oceanog. 46:557-572. 

60. Lanoil, B. D., C. Carlson and S. J. Giovannoni. 2000. Bacterial chromosomal painting for in situ
monitoring of cultured marine bacteria. Environ. Microbiol. 2:654-665. 

59. Rappé, M.S., Vergin, K. and Giovannoni, S.J. 2000. Phylogenetic comparisons of a coastal
bacterioplankton community with its counterparts in open ocean and freshwater systems. FEMS Microb 
Ecol 33: 219-232. 

58. Gordon, D.A., J. Priscu and S.J. Giovannoni. 2000. Origin and phylogeny of microbes living in
permanent Antarctic lake ice. Microb. Ecol. 39:197-202. 

57. Janson, S., Bergman, B., Carpenter, E.J., Giovannoni, S.J., Vergin, K. 1999. Genetic Analysis of the
Marine Diazotrophic cyanobacterium Trichodesmium. FEMS Microb. Ecol. 30:57-65. 

56. Fisk, M. and S. J. Giovannoni. Sufficient conditions for a deep biosphere on Mars. 1999. Journal of
Geophysical Research, Planets. 104:11,805-11,815. 

55. Urbach, E., K. L. Vergin and S.J. Giovannoni. 1999. Immunochemical detection and isolation of DNA 
from metabolically active bacteria. Appl. Environ. Microbiol. 65:1207-1213. 

54. McAshan, S.K., K.L. Vergin, S. J. Giovannoni and D.S. Thaler. 1999. Interspecies hybridization in the
Enterococci via conjugation of chromosomal vancomycin resistance. Microb. Drug 
Resis. 5:101-112. 

53. Mauel, M.J., S.J. Giovannoni and J.L. Fryer. 1999. Phylogenetic analysis of Piscirickettsia salmonis
isolates by 16S ribosomal DNA sequencing. Dis. Aquat. Org. 35:115-123. 

52. Rappé, M.S., D.A. Gordon, K. L. Vergin, S.J. Giovannoni. 1999. Phylogeny of Actinobacteria-related
SSU rRNA gene clones recovered from marine bacterioplankton. Syst. Appl. Microbiol. 22:106-112.

51. Suzuki, M., M.S. Rappé, and S.J. Giovannoni. 1998. Kinetic bias in estimates of coastal picoplankton
community structure obtained by measurements of SSU rDNA PCR-amplicon length heterogeneity. 
Appl. Environ. Microbiol. 64:4522-4529. 

50. Fisk, M. R., S. J. Giovannoni and I. Thorseth. 1998. Alteration of oceanic volcanic glass: textural
evidence for microbial activity. Science 281: 978-980. 

49. Vergin, K., E. Urbach, J. L. Stein, E. F. DeLong and S. J. Giovannoni. 1998. Screening of a fosmid
library of marine environmental genomic DNA fragments reveals four clones related to Planctomycetales.
Appl. Environ. Microbiol. 64: 3075-3078.

48. Urbach, E., C. Schindler and S.J. Giovannoni. 1998. A PCR fingerprinting technique to distinguish
isolates of Lactococcus lactis. FEMS Microbiol. Let. 162:111-115. 

47. Rappé, M.S., M. Suzuki, K. L. Vergin, S.J. Giovannoni. 1998. Phylogenetic diversity of
ultraplankton plastid SSU rRNA genes recovered in environmental nucleic acid samples from the Pacific 
and Atlantic coasts of the United States. Appl. Environ. Microbiol. 64:294-303. 

46. Priscu, J., C.H. Fritsen, E. Adams, S.J. Giovannoni, H. Paerl, C. McKay, D. Gordon, and B. Lanoil.
1998. Perennial Antarctic lake ice: an oasis for life in a polar desert. Science 280:2095-2098.

45. Wright, T. D., K. Vergin, P. Boyd and S.J. Giovannoni. 1997. A novel δ-Proteobacterial lineage from
the lower ocean surface layer. Appl. Environ. Microbiol. 63:983-989. 

44. Lanoil, B. D., and S.J. Giovannoni. 1997. Identification of bacterial cells by chromosomal painting. 
Appl. Environ. Microbiol. 63:1118-1123. 

43. Suzuki, M., M. S. Rappé, Z.W. Haimberger, H. Winfield, N. Adair, J. Strttbel and S.J. Giovannoni.
1997. Bacterial diversity among SSU rDNA gene clones and cellular clones from the same seawater 
sample. Appl. Environ. Microbiol. 63:983-989. 

42. Urbach, E., B. Daniels, M. S. Salama, W. E. Sandine and S.J. Giovannoni. 1997. The ldh phylogeny for
environmental isolates of Lactococcus lactis is consistent with the rRNA genotypes, but not with 
phenotypes. Appl. Environ. Microbiol. 63: 694-702. 

41. Field, K.G., N. Adair, D.A. Gordon, M. S. Rappé and S.J. Giovannoni. 1997. Genetic diversity and
depth-specific speciation within the SAR11 cluster, a marine bacterial lineage. Appl. Environ. Microbiol. 
63:63-70. 

40. Rappé, M.S., P.F. Kemp and S.J. Giovannoni. 1997. Phylogenetic diversity of marine coastal
picoplankton 16S rRNA genes cloned from the continental shelf off Cape Hatteras, N.C. Limnol. 
Oceanog. 42:811-826. 

39. Mauel, M.J., S. J. Giovannoni and J.L. Fryer. 1997. Development of polymerase chain reaction assays
for detection, identification and differentiation of Piscirickettsia salmonis. Dis. Aquat. Org. 26:189-195. 

38. Lanoil, B. D., L.M. Ciuffetti and S. J. Giovannoni. 1996. The marine bacterium Pseudoalteromonas
haloplanktis has a complex genome structure composed of two separate genetic units. Genome Res. 
6:1160-1169. 

37. Giovannoni, S.J., M. S. Rappé, K. L. Vergin and N. Adair. 1996. 16S rRNA genes reveal stratified
open ocean bacterioplankton populations related to the Green Non-Sulfur bacteria. Proc. Natl. Acad. Sci. 
U.S.A. 93:7979-7984. 

36. Gordon, D.A. and S. J. Giovannoni. 1996. Stratified microbial populations related to Chlorobium and
Fibrobacter detected in the Atlantic and Pacific oceans. Appl. Environ. Microbiol. 62:1171-1177. 

35. Suzuki, M., and S. J. Giovannoni. 1996. Bias caused by template annealing in the amplification of 
16S rRNA genes by PCR. Appl. Environ. Microbiol. 62:625-630. 

34. Giovannoni, S.J., M. R. Fisk, Mullins, T.D. and Furnes, H. 1996. Genetic evidence for endolithic
microbial life colonizing basaltic glass/ seawater interfaces. Proceedings of the Ocean Drilling Program 
148:207-214. 



33. Rappé, M. S., Kemp, P. F., and S. J. Giovannoni. 1995. Chromophyte plastid 16S ribosomal RNA
genes found in a clone library from Atlantic Ocean seawater. J. Phycol. 31:979-988.

32. Salama, M. S., Musafija-Jeknic, T., W. E. Sandine, and S. J. Giovannoni. 1995. An ecological study of 
lactic acid bacteria: isolation of new strains of Lactococcus including Lactococcus lactis subspecies cremoris. 
Journal of Dairy Science 78:1-14. 

31. Salama, S., W. Sandine, and S. J. Giovannoni. 1995. A milk-based method for detecting 
antimicrobial substances produced by lactic acid bacteria. J. Dairy Sci. 78:1219-1223. 

30. Mullins, T. D., T. B. Britschgi, R. L. Krest, and S. J. Giovannoni. 1995. Genetic comparisons reveal 
the same unknown lineages in Atlantic and Pacific bacterioplankton communities. Limnol. Oceanog. 
40:148-158. 

29. Salama, S., W. Sandine, and S. J. Giovannoni. 1993. Isolation of Lactococcus lactis subsp. cremoris
from nature by colony hybridization with rRNA probes. Appl. Environ. Microbiol. 57:1313-1318. 

28. Lovley, D. R., S. J. Giovannoni, D. C. White, J. E. Champine, E. Phillips, Y. A. Gorby, and S. 
Goodwin. 1993. Geobacter metallireducens gen. nov. sp. nov., a microorganism capable of coupling the 
complete oxidation of organic compounds to the reduction of iron and other metals. Archive Microbiol. 
159:336-344. 

27. Cary, S. C. and S. J. Giovannoni. 1993. Transovarial inheritance of endosymbiotic bacteria in deep- 
sea vesicomyid clams. Proc. Natl. Acad. Sci. USA 90:5695-5699. 

26. Cary, S. C., W. Warren, E. Anderson, and S. J. Giovannoni. 1993. Identification and localization of
bacterial endosymbionts in hydrothermal vent taxa with symbiont-specific PCR amplification and in situ 
hybridization techniques. Mol. Mar. Biol. Biotech. 2:251-262. 

25. Liesack, W., R. Soller, T. Stewart, H. Haas, S. J. Giovannoni, and E. Stackebrandt. 1992. The 
influence of tachytelically (rapidly) evolving sequences on the topology of phylogenetic trees - intrafamily
relationships and the phylogenetic position of the Planctomycetaceae as revealed by comparative analysis 
of 16S ribosomal RNA sequences. System. Appl. Microbiol. 15:357-362. 

24. Lane, D. J., A. P. Harrison, Jr., D. Stahl, B. Pace, S. J. Giovannoni, G. J. Olsen, and N. R. Pace. 1992. 
Evolutionary relationships among sulfur- and iron-oxidizing eubacteria. J. Bacteriol. 174(1):269-278.

23. Fryer, J. L., C. N. Lannan, S. J. Giovannoni, and N. D. Wood. 1992. Piscirickettsia salmonis gen. nov., sp.
nov., the causative agent of an epizootic disease in salmonid fishes. Int. J. Sys. Bacteriol. 42:120-126. 

22. Field, K. G., S. M. Landfear, and S. J. Giovannoni. 1991. 18S rRNA sequences of Leishmania enrietti 
promastigote and amastigote. International Journal for Parasitology 21:483-485. 

21. Britschgi, T. B. and S. J. Giovannoni. 1991. Phylogenetic analysis of a natural marine 
bacterioplankton population by rRNA gene cloning and sequencing. Appl. Environ. Microbiol. 57:1707- 
1713. 

20. Salama, S., W. Sandine, and S. J. Giovannoni. 1991. Development and application of oligonucleotide
probes for identification of Lactococcus lactis subsp. cremoris. Appl. Environ. Microbiol. 57:1313-1318. 

19. Gutenberger, S. K., S. J. Giovannoni, K. G. Field, J. L. Fryer, and J. S. Rohovec. 1991. A phylogenetic 
comparison of the 16S and rRNA sequence of the fish pathogen, Renibacterium salmoninarum, to
Gram-positive bacteria. FEMS Microbiol. Let. 77:151-156.



18. Giovannoni, S. J., E. F. DeLong, T. M. Schmidt, and N. R. Pace. 1990. Tangential flow filtration and 
preliminary phylogenetic analysis of marine picoplankton. Appl. Environ. Microbiol. 56:2572-2575. 

17. Giovannoni, S. J., T. B. Britschgi, C. L. Moyer, and K. G. Field. 1990. Genetic diversity in Sargasso 
Sea bacterioplankton. Nature 345:60-63. 

16. Huss, A. R. and S. J. Giovannoni. 1989. Primary structure of the chloroplast small subunit ribosomal 
RNA gene from Chlorella vulgaris. Nucleic Acids Res. 22:9487. 

15. Turner, S., T. Burger-Wiersma, S. J. Giovannoni, L. R. Mur, and N. R. Pace. 1989. The relationship of 
a prochlorophyte, Prochlorothrix hollandica, to green chloroplasts. Nature 337:380-382. 

14. Weisburg, W. G., S. J. Giovannoni, and C. R. Woese. 1989. The Deinococcus-Thermus phylum and the
effect of rRNA composition on phylogenetic tree construction. System. Appl. Microbiol. 11:128-134. 

13. Bomar, D., S. J. Giovannoni, and E. Stackebrandt. 1988. A unique type of eubacterial 5S rRNA in
members of the order Planctomycetales. J. Mol. Evol. 27:121-125. 

12. Distel, D. L., D. L. Lane, G. J. Olsen, S. J. Giovannoni, B. Pace, N. Pace, D. Stahl, and H. Felbeck. 
1988. Sulfur-oxidizing bacterial endosymbionts: Analysis of phylogeny and specificity by 16S ribosomal 
RNA sequences. J. Bacteriol. 170:2506-2510. 

11. Field, K. G., G. J. Olsen, D. J. Lane, S. J. Giovannoni, M. T. Ghiselin, E. C. Raff, N. R. Pace, and R. A. 
Raff. 1988. Molecular phylogeny of the animal kingdom based on 18S ribosomal RNA sequences. 
Science 239:748-753. 

10. Giovannoni, S. J., E. DeLong, G. J. Olsen, and N. R. Pace. 1988. Phylogenetic group-specific
oligodeoxynucleotide probes for in situ microbial identification. J. Bacteriol. 170:720-726. 

9. Giovannoni, S. J., S. Turner, G. T. Olsen, S. Barns, D. T. Lane, and N. R. Pace. 1988. Evolutionary 
relationships among cyanobacteria and green chloroplasts. J. Bacteriol. 170:3584-3592. 

8. Karl, D. M., G. T. Taylor, J. A. Novitski, H. W. Jannasch, C. O. Wirsen, N. R. Pace, D. J. Lane, G. J. 
Olsen, and S. J. Giovannoni. 1988. A microbiological study of Guaymas basin high temperature
hydrothermal vents. J. Deep Sea Res. 35:777-791. 

7. Giovannoni, S. J., E. Schabtach, and R. W. Castenholz. 1987. Isosphaera pallida, gen. and comb. nov., a
gliding, budding eubacterium from hot springs. Arch. Microbiol. 147:276-284. 

6. Giovannoni, S. J., W. Godchaux, E. Schabtach, and R. W. Castenholz. 1987. Cell wall and lipid 
composition of Isosphaera pallida, a budding eubacterium from hot springs. J. Bacteriol. 169:2702-2707. 

5. Giovannoni, S. J., D. M. Ward, N. P. Revsbech, and R. W. Castenholz. 1987. Obligately phototrophic 
Chloroflexus: primary production in anaerobic, hot spring microbial mats. Arch. Microbiol. 147:80-87. 

4. Pierson, B. K., S. J. Giovannoni, D. L. Stahl, and R. W. Castenholz. 1985. Heliothrix oregonensis, gen. 
nov., sp. nov., a phototrophic filamentous gliding bacterium containing bacteriochlorophyll a. Arch. 
Microbiol. 142:164-167. 

3. Pierson, B. K., S. J. Giovannoni, and R. W. Castenholz. 1984. Physiological ecology of a gliding 
bacterium containing bacteriochlorophyll a. Appl. Environ. Microbiol. 47:576-584.

2. Giovannoni, S. J., and L. Margulis. 1981. A red Benekea from Laguna Figueroa, Baja, California.
Microbios 30:47-63. 

1. Margulis, L., E. S. Barghoorn, D. Ashendorf, S. Banerjee, D. Chase, S. Francis, S. J. Giovannoni, and 
J. Stolz. 1980. The microbial community in the layered sediments at Laguna Figueroa, Baja, California. 
Precam. Res. 11:93-123. 



Reviews, Book Chapters and Other Non-Peer Reviewed Publications: 
14. Giovannoni, S. J. 2004. Oceans of bacteria. Nature 430:515-516. 

13. M. Rappé and Giovannoni, S. J. 2003. The Uncultured Microbial Majority. Ann. Rev. Microbiol.
57:369-394. 

12. Giovannoni, S. J., and M. Rappé. 2000. Evolution, Diversity and Molecular Ecology of Marine
Prokaryotes. p. 47-84. In Kirchman, D. (ed.) Microbial Ecology of the Oceans. John Wiley & Sons, Inc., 
New York. 

11. Giovannoni, S. and M. Rappé. 1999. Microbial Diversity: It's a New World. The NEB Transcript.
10:1-4. 

10. Giovannoni, S. J., M. Rappé, D. Gordon, E. Urbach, M. Suzuki, and K. G. Field. 1996. Ribosomal
RNA and the evolution of bacterial diversity, p. 63-85. In Roberts, D. McL.,Sharp, P. Alderson, G. and 
Collins, M. (ed.) "Evolution of Microbial Life". Society for General Microbiology Symposium 54. 
Cambridge University Press. 

9. Giovannoni, S. J., T. Mullins, and K. G. Field. 1995. Microbial diversity in marine systems: rRNA 
approaches to the study of unculturable microbes. In: "Molecular Ecology of Aquatic Microbes," ed. Ian 
Joint, Springer-Verlag, Berlin-Heidelberg-New York-Tokyo.

8. Giovannoni, S. J., and S. C. Cary. 1993. Probing marine systems with ribosomal RNAs. 
Oceanography 6:95-104. 

7. Giovannoni, S. J., N. Wood, and V. A. R. Huss. 1993. Molecular Phylogeny of Oxygenic Phototrophic
Cells and Organelles from Small-Subunit Ribosomal RNA Sequences. Pages 159-170. In: Origins of 
Plastids, R. A. Lewin (ed.) Chapman and Hall, NY, NY. 

6. Staley, J. T., J. L. Fuerst, S. Giovannoni, and H. Schlesner. 1991. The Order Planctomycetales and the 
Genera Planctomyces, Pirellula, Gemmata and Isosphaera. Pages 3710-3731. In: M. Dworkin et al. (eds.) The 
Prokaryotes. Volume 4, Chapter 203. Springer-Verlag, New York.

5. Giovannoni, S. J. 1991. The polymerase chain reaction. Pages 177-203. In: E. Stackebrandt, and M. 
Goodfellow (eds.) Modern Microbiological Methods: Nucleic Acids Techniques in Bacterial Systematics.

ABSTRACT

The Ribosomal Database Project (RDP-II) provides
the research community with aligned and annotated
rRNA gene sequences, along with analysis services
and a phylogenetically consistent taxonomic framework
for these data. Updated monthly, these services
are made available through the RDP-II website
(http://rdp.cme.msu.edu/). RDP-II release 9.21 (August 2004)
contains 101 632 bacterial small subunit rRNA gene
sequences in aligned and annotated format. High-throughput
tools for initial taxonomic placement,
identification of related sequences, probe and primer
testing, data navigation and subalignment download
are provided. The RDP-II email address for questions
or comments is rdpstaff@msu.edu.



DESCRIPTION 

Release 9 introduces substantial changes to the Ribosomal 
Database Project (RDP). These changes are in response to 
the rapidly increasing number of available ribosomal RNA 
gene sequences (rRNA sequences) and the trend toward high- 
throughput rRNA sequencing with the concomitant need for 
high volume rRNA analysis tools. This paper describes changes 
since the 2003 description (1). Details about the data and 
analysis services can be found at the RDP-II website
(http://rdp.cme.msu.edu/).

Sequences. The RDP obtains bacterial rRNA sequences
from the International Nucleotide Sequence Databases
(INSD: GenBank/EMBL/DDBJ) on a monthly basis. These
sequences are aligned against a general bacterial rRNA
model using a modified version of RNACAD (2), a Stochastic
Context Free Grammar (SCFG)-based rRNA aligner that
directly incorporates rRNA secondary structure information
into its internal model. This aligner is trained on a set of
high-quality hand-aligned sequences and incorporates the
conserved bacterial secondary structure model of Gutell and
co-workers (3). As of release 9.21 (August 2004), the database
contained 101 632 total small subunit bacterial rRNA
sequences. Of these, 39 772 were near full-length (≥1200
bases), 54 316 came from uncultured organisms and 4431
were from type strains of validly named bacterial species.

Taxonomy. All Release 9 tools use a new hierarchical 
framework (RDP Hierarchy) differing significantly from the 
hierarchy provided with previous RDP releases. The RDP 
Hierarchy is based on the new phylogenetically consistent 
higher-order bacterial taxonomy proposed by Garrity et al.
(4) (http://dx.doi.org/10.1007/bergeysoutline). This hierarchy
provides order to the collection. It provides a phylogenetic 
framework into which to place results of the RDP analysis 
functions, and it provides an entry point for users looking for 
sequences from specific groups of organisms. New sequences 
are placed into the RDP Hierarchy using the RDP Classifier 
(see below). 

Analysis services 

The RDP analysis services have been completely revised to 
support the emerging trend toward high-throughput rRNA 
sequence analysis in microbial ecology and related disciplines. 
Three of the tools listed below incorporate the concept of data
filters. The user can choose to apply up to three data filters on
the view or analysis. By applying the three filters, the user can
(i) include only environmental clone or only isolate sequences;
(ii) include only sequences ≥1200 bases in length (near
full-length) or only shorter sequences; and (iii) include only
sequences from type strains or only non-type strain sequences.
The latter filter is of special importance since type strains act as 
a link between rRNA-based phylogeny and taxonomy. A more 
detailed description of each analysis service can be found at 
the RDP website. 
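
The three data filters can be pictured as independent, optional predicates over sequence records. The sketch below is illustrative only and is not RDP code: the Record fields, the apply_filters function and its option names are assumptions introduced here; only the 1200-base threshold comes from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

NEAR_FULL_LENGTH = 1200  # bases; threshold stated in the text


@dataclass(frozen=True)
class Record:
    """Hypothetical sequence record; field names are assumptions."""
    accession: str
    length: int
    is_isolate: bool       # False means environmental clone
    is_type_strain: bool


def apply_filters(records: Iterable[Record],
                  source: Optional[str] = None,   # "isolate" | "environmental"
                  size: Optional[str] = None,     # "near_full" | "short"
                  strain: Optional[str] = None    # "type" | "non_type"
                  ) -> List[Record]:
    """Keep records passing every filter that is set; None disables a filter."""
    kept = []
    for r in records:
        if source == "isolate" and not r.is_isolate:
            continue
        if source == "environmental" and r.is_isolate:
            continue
        if size == "near_full" and r.length < NEAR_FULL_LENGTH:
            continue
        if size == "short" and r.length >= NEAR_FULL_LENGTH:
            continue
        if strain == "type" and not r.is_type_strain:
            continue
        if strain == "non_type" and r.is_type_strain:
            continue
        kept.append(r)
    return kept
```

Because each filter defaults to off, zero to three filters can be combined in any order, which matches the "up to three data filters" behavior described above.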

Hierarchy Browser allows rapid navigation through the
RDP sequence data. The browser presents views of the RDP
sequences placed either in the RDP Hierarchy, or optionally
in the NCBI taxonomy hierarchy (5). While navigating, the
browser automatically expands an appropriate number of
hierarchical levels to fit the display. At any time, the user can
select for later download of both individual sequences and 
those of entire taxa. Data filters can be applied at any time 
to limit the display to specific data subsets. In addition, the 
user can quickly search for words or phrases in the sequence 
definition line. This includes the organism name and strain 
designation (if available), culture collection identifiers and 
INSD nucleic acid accession identifiers. 

RDP Classifier places sequences into the RDP Hierarchy. 
Optimized for large query sets, it can be used to give an initial 
taxonomic placement for a single sequence or hundreds of 
sequences. The first result page summarizes the assignments 
on an interactive display similar to that of the Hierarchy 
Browser. Each node in the hierarchy lists the number of 
user queries assigned to that taxonomic rank. A confidence 
estimate is generated for each assignment, and the assignments 
are displayed only when the estimate is above a user-specified 
confidence threshold. At any time, the user can switch to a 
detail view showing the detailed taxonomic assignments and 
confidence scores for any subset of query sequences. These 
assignment details can also be downloaded in a file suitable for 
import into popular spreadsheet programs. 

Sequence Match is a complete re-implementation of the
original Sequence Match method (1). Sequence Match finds
sequences similar to a user's query sequences using a word
matching strategy not requiring prior alignment. Sequence
Match is more accurate than BLAST (6) at finding closely
related rRNA sequences (Table 1). The related sequences
returned by Sequence Match serve as a good starting point
for more detailed examination of relatedness by classical
phylogenetic or other methods. The initial result page presents
a k-nearest neighbor (k-NN) classifier assignment of the query
sequences. A query is assigned to the lowest taxonomic rank
that includes the k highest scoring database sequences. The
value of k, as well as the three data filters, can be changed at
will in this view. The user can switch from the summary k-NN
view to a detailed results view for any query sequence. In this
view, the top k database matches to the query are displayed in
the RDP Hierarchy. In this mode, any subset of the matches
can be selected for transfer to the Hierarchy Browser and later
download. A third view presents sets of results in a format
suitable for download.

Table 1. rRNA search performance

Percentage of 16S rRNA queries (b) returning the most similar
sequence (c) among the highest scoring N results

Program (a)       N = 1    N = 10    N = 20
Sequence Match    65       92        95
BLAST             39       53        55

(a) For both programs, the dataset consisted of 37 456 near full-length (≥1200
base) rRNA sequences from the RDP release 9.20 alignment database.
(b) Query sequences (1000) were selected at random from the dataset.
(c) The most similar sequence to each query was determined by exhaustive
pairwise similarity comparison of each query against the dataset. In cases of a tie
in pairwise similarity, we required only one of the ties to be returned by the
program.
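
The word-matching search and the k-NN rank assignment can be sketched as follows. This is a minimal illustration, not the RDP implementation: the word size, the score (shared distinct words divided by the size of the smaller word set), the function names and the toy lineages are all assumptions introduced here.

```python
from typing import List, Sequence, Set, Tuple

Lineage = Tuple[str, ...]  # ranks from highest to lowest, e.g. (domain, phylum, genus)


def words(seq: str, size: int) -> Set[str]:
    """Set of distinct overlapping words (k-mers) of the given size."""
    return {seq[i:i + size] for i in range(len(seq) - size + 1)}


def word_score(a: str, b: str, size: int = 7) -> float:
    """Alignment-free similarity: shared words over the smaller word set (assumed formula)."""
    wa, wb = words(a, size), words(b, size)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / min(len(wa), len(wb))


def knn_assign(query: str,
               database: Sequence[Tuple[str, Lineage]],
               k: int = 3,
               size: int = 7) -> Lineage:
    """Return the lowest rank shared by the k highest scoring database entries."""
    ranked = sorted(database,
                    key=lambda entry: word_score(query, entry[0], size),
                    reverse=True)[:k]
    lineages: List[Lineage] = [lineage for _, lineage in ranked]
    common = []
    for ranks in zip(*lineages):          # walk down the ranks in parallel
        if all(r == ranks[0] for r in ranks):
            common.append(ranks[0])
        else:
            break                          # lineages diverge below this rank
    return tuple(common)
```

Raising k makes the assignment more conservative: with more neighbors to agree, the shared lineage can only stay the same or move to a higher rank.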

Probe Match is a complete re-implementation of our 
previous Probe Match program (1). It uses a more efficient 
algorithm that is better suited to the amount of rRNA data 
available today and in the foreseeable future. The new Probe 
Match accepts a candidate primer/probe, optionally with ambi- 
guity codons, of up to 64 bases in length. While our previous 
version searched for hits within a specified number of mis- 
matches (Hamming distance), the new version finds hits with a 
combination of mismatches and insertion/deletions (edit dis- 
tance). Since some single insertion/deletion may be no more 
deleterious than a single mismatch, this new capability offers a 
significant improvement in the detection of potential cross- 
hybridization. In our previous implementation, the high 
percentage of partial sequences in the database limited the 
program's utility; it was difficult to determine if database 
entries failed to match simply because the sequence was 
incomplete in the target region. In this new version, the users 
can restrict analysis to database entries containing sequence 
data for the candidate probe target region of the rRNA mole- 
cule. (However, the search is not limited to this region of 
the molecule.) Similar to the other new programs, the results 
are displayed in an interactive version of the RDP Hierarchy. 
Each taxonomic rank lists the total number of sequences 
searched and the number matching within a user-specified 
edit distance. This maximum edit distance can be changed 
on the fly. For any hierarchy node, users can switch to a 
detail view listing the matching sequences. A third format 
is suitable for download and import into spreadsheet or 
other programs. 
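
The difference between the two matching criteria can be illustrated with a short sketch. It is not the Probe Match algorithm itself, whose details are not given here: the function names are hypothetical, ambiguity codes are ignored, and the edit-distance search uses a standard dynamic-programming formulation for approximate substring matching.

```python
from typing import Optional


def best_hamming(probe: str, target: str) -> Optional[int]:
    """Fewest mismatches over all ungapped placements of the probe on the target."""
    m = len(probe)
    if len(target) < m:
        return None
    return min(
        sum(p != t for p, t in zip(probe, target[i:i + m]))
        for i in range(len(target) - m + 1)
    )


def best_edit(probe: str, target: str) -> int:
    """Fewest mismatches plus insertions/deletions against any substring of the target."""
    # Row 0 is all zeros so that a match may start at any target position.
    prev = [0] * (len(target) + 1)
    for i, p in enumerate(probe, 1):
        cur = [i]  # aligning i probe bases against nothing costs i
        for j, t in enumerate(target, 1):
            cur.append(min(prev[j - 1] + (p != t),  # match or mismatch
                           prev[j] + 1,             # probe base unmatched
                           cur[j - 1] + 1))         # extra target base
        prev = cur
    # The match may end at any target position.
    return min(prev)
```

For example, for probe ACGTACGT and a target containing ACGACGT (one base deleted), the best ungapped placement has four mismatches, while the edit distance is one. A search limited to a small Hamming distance would therefore miss a site that the edit-distance search reports.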



RDP-II ACCESS AND CONTACT 

The RDP-II data and analysis services can be found at
http://rdp.cme.msu.edu/. The RDP's mission includes user support.
The address for email support is rdpstaff@msu.edu. Telephone
support is available (+1 517 432 4998). The RDP-II staff may
also be contacted via fax (+1 517 353 8957 Attn: RDP) or
regular mail.

ABSTRACT The complete nucleotide sequence of the 16S
RNA gene from the rrnB cistron of Escherichia coli has been
determined by using three rapid DNA sequencing methods.
Nearly all of the structure has been confirmed by two to six
independent sequence determinations on both DNA strands. The
length of the 16S rRNA chain inferred from the DNA sequence
is 1541 nucleotides, in close agreement with previous estimates.
We note discrepancies between this sequence and the most recent
version of it reported from direct RNA sequencing
[Ehresmann, C., Stiegler, P., Carbon, P. & Ebel, J. P. (1977) FEBS
Lett. 84, 337-341]. A few of these may be explained by heterogeneity
among 16S rRNA sequences from different cistrons. No
nucleotide sequences were found in the 16S rRNA gene that
cannot be reconciled with RNase digestion products of mature
16S rRNA.

rRNA is becoming increasingly important in our current per- 
ception of the mechanism of action of ribosomes. In particular, 
we wish to understand more fully the functional role of 16S 
rRNA, the major molecular component of the small ribosomal 
subunit of prokaryotes. 16S rRNA has been directly implicated 
in discrimination of mRNA initiation sites (1, 2), tRNA binding 
(3, 4), and association of the two ribosomal subunits (5, 6). Full 
understanding of the workings of the ribosome consequently
is becoming limited by our lack of knowledge of rRNA
structure. Such information will also be essential for elucidation 
of the process by which these complex structures assemble 
themselves. Finally, we expect to gain insight into such diverse 
problems as protein-nucleic acid recognition and the evolu- 
tionary origin of the coding process from a more thorough 
knowledge of rRNA structure. 

Partial nucleotide sequences for 16S RNA have been pub- 
lished (7-9). However, numerous discrepancies in oligonu- 
cleotide sequences (6, 10, 11) and in the ordering of oligonu- 
cleotides (6, 12-14) have been reported by other investigators. 
Additional evidence for sequence errors comes from restriction 
endonuclease mapping of a 16S RNA gene: some cleavage sites 
predicted from the published sequences were not found (un- 
published data). Finally, chemical (6, 12, 15) and enzymatic 
(7, 8, 16) probes have been used to detect single-stranded re- 
gions of 16S RNA; the lack of agreement of these findings with 
the proposed secondary structure derived from the published 
sequences further suggests error in the primary structure. In 
view of the possible misinterpretation of various biochemical 
studies involving 16S RNA, we were prompted to reinvestigate 
its primary structure. 

The availability of rapid DNA sequence methods made
possible the derivation of the nucleotide sequence of 16S RNA
by direct sequencing of the 16S rRNA gene from the rrnB cistron
of Escherichia coli. We present here the complete sequence
of the portion of the cistron corresponding to mature
16S RNA. Reported discrepancies with the previously published
sequence have been confirmed, and additional errors have been
found involving oligonucleotide sequences, ordering of
oligonucleotides, and, in one instance, the location of a larger section
of the primary structure. No nucleotide sequences were found
that cannot be accounted for from the RNase digestion products
of mature 16S rRNA.

The publication costs of this article were defrayed in part by page
charge payment. This article must therefore be hereby marked
"advertisement" in accordance with 18 U. S. C. §1734 solely to indicate
this fact.



METHODS 

Cloning and Mapping of DNA. The 16S rRNA gene from
the rrnB cistron of E. coli was cloned from two EcoRI restriction
fragments of λrifd18 (17, 18) in the ColE1 plasmid vector.
Determination of the location of the 16S rRNA sequences and
restriction enzyme cleavage sites will be described elsewhere.
The small HindIII fragment from pER24 was excised and
reinserted into the vehicle pBR322 (19) to give the recombinant
plasmid pKK115 (see Fig. 1). All recombinant DNA experiments
were carried out under P1, EK1 conditions, as specified
in the National Institutes of Health guidelines.

Generation of DNA Restriction Fragments. Plasmid DNA
was isolated and digested on a milligram scale with EcoRI or
HindIII, and the inserted sequence was resolved from the
cloning vector by sucrose gradient centrifugation (20). The
purification of restriction enzymes Alu I, Hpa II, EcoRI, Sma
I, Bgl II, and HindIII will be described elsewhere; Hae III, Hha
I, HinfI, and Mbo II were purchased from New England
Biolabs; Ava II and Taq I were from Bethesda Research Laboratories.
The cloned fragments (20-40 pmol) were digested
with the appropriate restriction endonuclease to obtain DNA
fragments of suitable size for sequence determination and
isolated as described below.

Sequencing by Primed Synthesis with DNA Polymerase. 
For sequencing by either of the two methods developed by 
Sanger and coworkers (21, 22), restriction fragment primers 
were isolated by electrophoresis on 1% agarose gels (20 × 20 ×
0.3 cm) in 0.02 M Na acetate/0.04 M Tris acetate, pH 8.3/2 mM
EDTA or on 8% acrylamide/0.26% bisacrylamide gels (20 ×
40 × 0.15 cm) in 0.09 M Tris borate, pH 8.3/2.5 mM EDTA.
DNA bands were located by ethidium bromide staining, ex- 
cised, crushed, and eluted by shaking overnight at room tem- 
perature in 0.5 M NaCl/0.1 M Tris-HCl, pH 8.0/5 mM EDTA 
and recovered by ethanol precipitation. Template strands were 
prepared from pER18 DNA by digestion with Bgl II, which 
cleaves at a single site in this plasmid, followed by cesium 
chloride density gradient centrifugation after heating and quick
cooling in the presence of poly(U,G) (Miles) as described (23). 
Deoxynucleoside triphosphates labeled in the α position with
32P (200-300 Ci/mmol) were obtained from New England
Nuclear. Dideoxynucleoside triphosphates were obtained from 
P-L Biochemicals; DNA polymerase I and its proteolytic 
fragment lacking 5' exonuclease activity were from Boehringer. 
Sequence determinations by the "plus and minus" method or 
by the "terminator" method were carried out as described (21, 
22).

FIG. 1. Physical maps of recombinant plasmids containing portions of the 16S rRNA gene from the rrnB cistron of E. coli. The positions
of 16S and 23S rRNA and tRNA sequences are shown by black segments. Sites of cleavage by restriction endonucleases are denoted by arrows.
Wild-type λ sequences are shown by hatched segments, and the approximate position of the promoter for the rrnB cistron is indicated by the
letter P. bp, base pairs.

FIG. 2. Schematic map of the 16S rRNA gene, showing the detailed positions of restriction endonuclease cleavages, and the DNA fragments
actually used in the sequence determination. Sequences obtained by primed synthesis (21, 22) are indicated by horizontal bars ending in arrows
showing the direction of the chain extension. Sequences derived from partial chemical modification (24) are shown with asterisks at the 5' end
of the labeled chain. Restriction endonucleases: a, EcoRI; b, HindIII; c, Sma I; d, Sal I; e, Bgl II; f, Alu I; g, Ava II; h, Hae III; i, Hha I; j, HinfI;
k, Hpa II; l, Mbo II; m, Taq I. DNA fragments: 1, Taq I-3f; 2, Taq I-7f; 3, Taq I-3s; 4, Alu I-4f; 5, Hae III-13/L; 6, Hae III-13/H; 7, HinfI-1s; 8,
HinfI-1s; 9, HinfI-1f; 10, Hae III-2f; 11, Hha I-3f; 12, Mbo II-12s; 13, Taq I-10s; 14, Taq I-9f; 15, Hpa II-4f; 16, Hintl-li; 17, HinfI-3f; 18, HintlM;
19, Hae III-1f; 20, Mbo II-12f; 21, Taq I-10f; 22, Taq I-9s; 23, Hpa II-4s; 24, HinfI-1s; 25, HindIII/BamHI-2; 26, Hmfl^s; 27, HinfI/Hpa II-2;
28, Hpa II-2f; 29, f/mdIII/BamHM; 30, Hpa II-6s; 31, Mbo II-5s; 32, Ava II-4s; 33, Sma I/£coRM/L; 34, Sma I/EcoRI-2/H; 35, Hae III-3f;
36, Hpa II/HinfI-2; 37, Hpa II-2s; 38, Hpa II-13f; 39, Hpa II-6f; 40, Ava II/Hpa II-2; 41, Ava II-4f; 42, Hpa II-9f; 43, Sma I/EcoRI-2/H. The
portions of the sequence covered by the plasmid DNA molecules in Fig. 1 are also shown.

FIG. 3. Autoradiographs of sequence gels from the partial chemical modification method applied to the single-stranded DNA fragment
Hae III-1f (Fig. 2). The sequence read from the gel is complementary to the 16S RNA sequence and covers the region from positions 421 to 660.
The positions of nucleotides in the 16S RNA sequence are indicated by the numbers. The gels (0.7 × 20 × 40 cm) were 12% polyacrylamide/0.6%
bisacrylamide run 1, 3, 9, and 12 hr, respectively, at 1000 V (Left and Center) and 8% acrylamide/0.4% bisacrylamide run for 7 and 10 hr,
respectively, at 1000 V (Right).
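
As a hedged illustration of the relationship noted in the legend to Fig. 3, a gel reading of the DNA strand complementary to the RNA can be converted to the RNA-sense sequence by reverse complementation, with U written in place of T. The function name and the six-base example read below are hypothetical and are not taken from the gels; the sketch assumes an unambiguous read containing only A, C, G, and T.

```python
# Complement each DNA base and write the result in RNA letters (U for T).
_DNA_TO_COMPLEMENTARY_RNA = str.maketrans("ACGT", "UGCA")


def rna_from_complementary_read(dna_read):
    """Convert a 5'->3' read of the DNA strand complementary to the RNA
    into the RNA-sense sequence, also written 5'->3'."""
    dna_read = dna_read.upper()
    unknown = set(dna_read) - set("ACGT")
    if unknown:
        raise ValueError(f"unexpected characters in read: {sorted(unknown)}")
    # Complement base by base, then reverse to restore 5'->3' orientation.
    return dna_read.translate(_DNA_TO_COMPLEMENTARY_RNA)[::-1]


if __name__ == "__main__":
    # Hypothetical six-base read, for illustration only.
    print(rna_from_complementary_read("TTCAAT"))  # AUUGAA
```

Reversal is needed because the two strands are antiparallel: the first base read from the complementary strand pairs with the last base of the corresponding RNA segment.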



Sequencing by the Partial Chemical Modification Method.
For sequence determination according to the method of Maxam
and Gilbert (24), the mixture of restriction fragments (20-40
pmol) was treated with bacterial alkaline phosphatase (1 μg/5
μg of DNA) for 1 hr at 37°, extracted four times with an equal
volume of phenol saturated with 10 mM Tris-HCl, pH 8.0/1
mM EDTA, and precipitated with ethanol. The fragments were
then labeled at their 5' termini by using 5-10 units of polynucleotide
kinase (P-L Biochemicals) and 0.5-2 mCi of [γ-32P]ATP
(1500-2000 Ci/mmol) synthesized as described (25). Excess
ATP and phosphate were removed by passing the reaction
mixture through a column of Sephadex G-75 in 2.5 mM Tris-HCl/1
mM NaCl, pH 7.4, poured in a 1-ml disposable pipet.
The peak containing the labeled DNA was lyophilized and
taken up in 50 μl of 0.02 M Tris borate, pH 8.3/0.5 mM EDTA,
and the labeled restriction fragments were resolved by electrophoresis
on an 8% polyacrylamide gel (0.5 × 20 × 40 cm) as
described above. DNA fragments were located by radioautography
and eluted as described above. Singly end-labeled DNA
was obtained by strand separation on 8% acrylamide/0.13%
bisacrylamide gels (0.5 × 20 × 40 cm) (26) after the DNA
sample was dissolved in 0.3 M NaOH/1 mM EDTA/10%
(vol/vol) glycerol/0.05% xylene cyanol/0.05% bromophenol
blue or by recleavage of the double-stranded DNA with a second
restriction endonuclease.

Chemical treatment of the end-labeled DNA was performed
essentially as described (24) except that a saturated solution of
NaCl was used for cytidine suppression, 1 M instead of 0.5 M
piperidine was used for the pyrimidine reactions, and piperidine
was removed under reduced pressure at 60° instead of by
lyophilization. Sequence gels were constructed and run essentially
as described (27).

RESULTS AND DISCUSSION 

Sequencing Strategy. The 16S RNA gene from the E. coli
rrnB cistron was conveniently isolated in two EcoRI fragments
from the transducing phage λrifd18 (17, 18). These fragments
were cloned by insertion at the EcoRI site of ColE1 plasmid,
as shown in Fig. 1. Because of the difficulty in growing large
amounts of the recombinant plasmid pER24, presumably due
to the presence of phage genes in the plasmid, the small HindIII
fragment was excised from pER24 and reinserted into the vehicle
pBR322 (19). Residues 1-80 and 648-673 and the sequence
preceding the 5' terminus of 16S RNA were obtained
from pER24, residues 81-647 were from pKK115, and residues
674-1541 and the sequence following the 3' terminus were from
pER18. Fig. 2 summarizes the positions of cleavage by the restriction
endonucleases used in this work and the fragments
from which the sequences were derived.
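
The way the three plasmids jointly account for the gene can be checked mechanically. The following Python sketch is illustrative only: the function name `tiles_exactly` and the dictionary `SOURCES` are hypothetical, and the second pER24 range is read as 648-673, the only value that leaves no gap after residue 647. Under those assumptions it confirms that the stated residue ranges cover positions 1-1541 with no gaps and no overlaps.

```python
def tiles_exactly(segments, first, last):
    """Return True if the inclusive (start, end) segments cover every
    position from first to last exactly once."""
    expected = first
    for start, end in sorted(segments):
        # Each segment must begin where the previous one stopped.
        if start != expected or end < start:
            return False
        expected = end + 1
    return expected == last + 1


# Residue ranges of mature 16S RNA assigned to each plasmid in the text.
SOURCES = {
    "pER24": [(1, 80), (648, 673)],
    "pKK115": [(81, 647)],
    "pER18": [(674, 1541)],
}

all_segments = [seg for segs in SOURCES.values() for seg in segs]

if __name__ == "__main__":
    print(tiles_exactly(all_segments, 1, 1541))
```

The flanking sequences preceding the 5' terminus and following the 3' terminus lie outside positions 1-1541 and are therefore not part of this check.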

Most of the sequence was derived by the partial chemical 
modification method of Maxam and Gilbert (24). The sequence 
was determined independently two to six times at each position,

pAAAUUGAAGAGUUUGAUCAUGGCUCAGAUUGAACGCUGGCGGCAGGCCUA 50
ACACAUGCAAGUCGAACGGUAACAGGAAGAAGCUUgTuCIIUUGCUGACGA 100
GUGGCGGACGGGUGAGUAAUGUCUGGGAAACUGCCUGAUGGAGGGGGAUA 150
ACUACUGGAAACGGUAGCUAAUACCGCAUAACGUCGCAAGACCAAAGAGG 200
GGGACCUUCGGGCCUCUUGCCAUCGGAUGUGCCCAGAUGGGAUUAGCUAG 250
UAGGUGGGGUAACGGCUCACCUAGGCGACGAUCCCUAGCUGGUCUGAGAG 300
GAUGACCAGCCACACUGGAACUGAGACACGGUCCAGACUCCUACGGGAGG 350
CAGCAGUGGg'gAAUAUUGCACAAUGGGCGCAAGCCUGAUGCAGCCAUGCC 400
GCGUGUAUGAAGAAGGCCUUCGGGUUGUAAAGUACUu"uCAGCGGGGAGGA 450
AGGGAGUAAA'GUUAAU^CUUUGCUCAUUGACGUUACCCGCAGAAGAAGC 500
C" t
ACCGGCUAACUCCGUGCCAGCAGCCJGCGGUAAUACGGAGGGUGCAAGCGU 550
UAAUCGGAAUUACUGGGCGUAAAGCGCACGCAGGCGGUUUGUUAAGUCAG 600
AUGUGAAAUCCCCGGGCUCAACCUGGGAACUGCAUCUGAUACUGGCAAGC 650
UUGAGUCUCGUAGAGGGGGGUAGAAUUCCAGGUGUAGCGGUGAAAUGCGU' 700
A0AGAUCUGGAGGAAUACCGGUGGCGAAGGCGGCCCCCUGGACGAAGACU 750
MCGCUCAGGUGCGAAAGCGUGGGGAGCAAACAGGAUUAGAUACCCUGGU 800
AGUCCACGCCGUAAACGAUGUCGACUUGGAGGUUGUGCCCUUGAGGCGUG 850
GCUUCCGGAGCUAACGCGUUAAGUCGACCGCCUGGGGAGUACGGCCGCAA 900
GGUUAAAACUCAAAUGAAUUGACGGGGGCCCGCACAAGCGGUGGAGCAUG 950
UGGUUUAAUUCGA ufelC AACGCGAAGAACCUUACCUGGUCUUGACAUCCAC 1000
GGAAGUUUUCAGAGAUGAGAAUGUGCCUUCGGGAACCGUGAGACAGGUGC 1050
UGCAUGGCUGUCGUCAGCUCGUGUUGUGAAAUGUUGGGUUAAGUCCCGCA 1100
ACGAGCGCAACCCUUAUCCUUUGUUGCCAGCGGUCCGGCCGGGAACUCAA 1150
AGGAGACUGCCAGUGAUAAACUGGAGGAAGGUGGGGAUGACGUCAAGUCA 1200
UCAUG&CCCUUACGACCAGGGCUACACACGUGCUACAAUGGCGCAUACAA 1250
AGAGAAGCGACCUCGCGAGAGCAAGCGGA ff CCUCAUAAAGUGCGUCGUAGU 1300
CCGGAUUGGAGUCUGCAACUCGACUCCAUGAAGUCGGAAUCGCUAGUAAU 1350
CGUGGAUCAGAAUGCCACGGUGAAUACGUUCCCGGGCCUUGUACACACCG 1400
SC m C CGUlCACACCAUGGGAGUGGGUUGCAAAAGAAGUAGGUAGCUUAACCUU 1450
j
CGGGAGGGCGCUUACCACUUUGUGAUUCAUGACUGGGGUGAAGUCG.UAAC 1500
AAGGUAACCGUAGGG G*^ CCUGCGGUUGGAUCACCUCCUU A OH 1541

FIG. 4. The nucleotide sequence of 16S rRNA from the rrnB cistron of E. coli. The RNA sequence was inferred from the DNA sequence. 
The ends of the 16S rRNA and positions of methylated nucleotides were identified by comparison with RNA sequence results (1, 6-10, 28-31, 
and unpublished). For comparison with the sequences reported by Ehresmann et al. (7-9), the positions of their lettered sections are given. In
some cases (e.g., section I*) the assignment of these sections was necessarily approximate. 

and by sequencing both DNA strands, except for residues 1-80,
288-336, 652-673, and 1513-1541, which were sequenced on
only one strand.

Sequence Results. Examples of sequence gels are shown in
Fig. 3. By use of thin gels (27) and other minor modifications,
it was often possible to read sequences accurately to about 250
base pairs from the proximal end of a fragment. Fig. 4 shows
the complete nucleotide sequence of the 16S RNA as derived
from the gene sequence. The total length of the chain is 1541
nucleotides, in close agreement with earlier estimates based on
chemical methods (9, 32) but somewhat less than values predicted
from physicochemical methods (33-35).

Comparison with Previous Sequence Results. Two conflicting
catalogs of RNase T1 oligonucleotides have been published
for E. coli 16S RNA (8, 36). In the course of determining
sequences around kethoxal-reactive sites in 16S RNA (6, 12),
we have noted some discrepancies between our results and those
of Ehresmann et al. (8), but our results are in agreement with
the sequences reported by Woese and coworkers (11, 36). In the
most recently published version of their sequences, Ehresmann
et al. (9) reported much closer agreement with the latter catalog,
but a few discrepancies remain. There are no differences
between the final catalog of T1 oligonucleotides as determined
by the Woese group and those in the present sequence (Fig. 4),
with the apparent exception of two modified oligonucleotides,
whose general forms are C-C-N-C-G (positions 524-528) and
N-C-C-G (positions 1401-1404) (refs. 10 and 11; C. Woese,
personal communication). The differences, however, are trivial
in that they concern only the identification of the modified
nucleotide itself. Woese and coworkers incorrectly assumed
that N in the pentanucleotide corresponds to m4Cm and therefore
N in the tetranucleotide is m7G (C. Woese, personal communication).

A serious source of discrepancy between the sequence presented
here and that by Ehresmann et al. (9) is in the alignment
of oligonucleotides. This has been previously noted in instances
in which kethoxal modification studies have provided ordering
of adjacent oligonucleotides (6), in a recent study using rapid
RNA sequencing techniques (13), and in other studies using
DNA methods (14). A prominent example of this problem is the
region 398-504 (Fig. 4). In the latest version of the sequence
of Ehresmann et al. (9), 43 nucleotides originating from positions
462-504 are inserted at position 432/433.

The remaining differences between the sequence presented 
here and the previous one (9) concern small sections of the 
molecule, and many involve the insertion or deletion of mono- 
or dinucleotides. In all, we note discrepancies involving about 
200 nucleotides. A few of these differences might be due to 
sequence heterogeneity among the various rRNA cistrons, for 
which evidence has been presented (7-10, 34). Our sequence
is in agreement with many of the recently published corrections
in the regions 1103-1165 and 1414-1488 reported by Ross and
Brimacombe (13) on the basis of a newly developed rapid RNA 
sequence method. However, these authors have apparently 
deleted pyrimidines at positions 1161, 1466, 1473, 1477, 1478,
and 1480. Young and Steitz (14) used DNA sequence methods 
and reported sequences from the 5' and 3' ends of 16S RNA 
from rrnD and one other rRNA cistron, as yet unidentified. 
Their results for residues 1-18 and 1381-1541 are in complete 
agreement with our findings. 

Consequences of Sequence Changes for Secondary 
Structure Prediction. Previously, we noted that a majority of 
the sites of kethoxal modification of 16S RNA, which must be 
single stranded (37), have been assigned to double-stranded 
structures in the secondary structure model for 16S RNA pro- 
posed by Ehresmann et al. (8). Furthermore, some of the sites 
of nuclease attack observed under partial digestion conditions 
reported by these authors (7, 8) are inconsistent with their 
model. As anticipated, the sequence in Fig. 4 gives rise to a very 
different secondary structure prediction, in which these con- 
flicts are largely resolved. A secondary structure model based 
on this sequence as well as a discussion of the structural and 
functional implications of these findings will be presented 
elsewhere. 

Possible Coding Functions of 16S RNA. The use of rRNA 
as messenger by the cell has often been considered. Examination 
of the 16S rRNA sequence in Fig. 4 shows that most AUG or 
GUG triplets are followed by in-phase termination codons after 
a short interval. One exception is the AUG triplet at position 
1187, which has no in-phase terminator until position 1439, 
allowing for a translation of a sequence of 84 amino acids. 
Furthermore, preceding the initiator codon by a distance of 
nine nucleotides is the sequence G-G-A-G-G, which would 
allow a favorable base-paired interaction with the 3'-terminal 
sequence of 16S rRNA in the mRNA recognition mechanism 
proposed by Shine and Dalgarno (1). Whether this theoretically
possible message is in fact translatable in vivo is open to ques- 
tion. We have searched the sequence for possible homology 
with known E. coli protein sequences, including the ribosomal
proteins, without success. It is most likely that translation of this 
sequence is precluded by formation of stable secondary struc- 
ture of the RNA and by binding of ribosomal proteins. 
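
The search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used for the analysis: the function names and the toy sequence are hypothetical, only AUG and GUG are treated as initiators, and only the three standard termination codons are considered. The helper `codons_between` reproduces the arithmetic in the text: an initiator at position 1187 and a first in-phase terminator at position 1439 are separated by 252 nucleotides, or 84 codons.

```python
START_CODONS = {"AUG", "GUG"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def open_reading_frames(rna):
    """Return (start, stop, n_codons) for each start codon that is followed
    by an in-phase stop codon.

    start and stop are 1-based positions of the first base of the start
    codon and of the first in-phase stop codon; n_codons counts the sense
    codons between them, including the start codon. Frames that reach the
    end of the sequence without a stop codon are not reported.
    """
    rna = rna.upper()
    frames = []
    for i in range(len(rna) - 2):
        if rna[i:i + 3] not in START_CODONS:
            continue
        # Step through the sequence in the reading frame set by this start.
        for j in range(i + 3, len(rna) - 2, 3):
            if rna[j:j + 3] in STOP_CODONS:
                frames.append((i + 1, j + 1, (j - i) // 3))
                break
    return frames


def codons_between(start, stop):
    """Number of sense codons from a start codon beginning at `start` up to,
    but not including, an in-phase stop codon beginning at `stop`."""
    if stop <= start or (stop - start) % 3 != 0:
        raise ValueError("stop must follow start and lie in the same phase")
    return (stop - start) // 3


if __name__ == "__main__":
    toy = "CCAUGGCUUAAGG"  # hypothetical sequence: AUG GCU UAA
    print(open_reading_frames(toy))      # [(3, 9, 2)]
    print(codons_between(1187, 1439))    # 84
```

Applied to a full sequence, sorting the returned frames by `n_codons` would single out unusually long frames of the kind noted at position 1187.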
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PERSPECTIVE 

A revolution is occurring in biology: perhaps it is better 
characterized as a revolution within a revolution. I am, of 
course, referring to the impact that the increasingly rapid 
capacity to sequence nucleic acids is having on a science that 
has already been radically transformed by molecular ap- 
proaches and concepts. While the impact is currently great- 
est in genetics and applied areas such as medicine and 
biotechnology, its most profound and lasting effect will be on 
our perception of evolution and its relationship to the rest of 
biology. The cell is basically an historical document, and 
gaining the capacity to read it (by the sequencing of genes) 
cannot but drastically alter the way we look at all of biology. 
No discipline within biology will be more changed by this 
revolution than microbiology, for until the advent of molec- 
ular sequencing, bacterial evolution was not a subject that 
could be approached experimentally. 
With any novel scientific departure it is important to
understand the historical setting in which it arises: the
paradigm it will change. Old prejudices tend to inhibit,
distort, or otherwise shape new ideas, and historical analysis 
helps to eliminate much of the negative impact of the status 
quo. Such analysis is particularly important in the present
instance since microbiologists do not deal with evolutionary 
considerations as a matter of course and so tend not to 
appreciate them. Therefore, I begin this discussion with a
brief look at how the relationship between microbiology and
evolution (i.e., the lack thereof) developed.

A Fruitless Search and Its Consequences 

Microbiologists of the late 1800s and early 1900s were 
certainly as cognizant of evolutionary considerations as any 
biologists. They assigned as much importance to determin- 
ing the natural (evolutionary) relationships among bacteria 
as zoologists and botanists did to determining metazoan 
genealogies. From Beijerinck to Kluyver to van Niel, a main 
concern of the Dutch school, perhaps the dominant force in
microbiology in the first half of this century, had been these
natural relationships. And it must have been the hope of
someday knowing them that inspired the founders of
Bergey's Manual to adopt for bacteria the same classifica- 
tion system used to group animals and plants phylogeneti- 
cally. 

The search for a bacterial phylogeny, however, became 
mired in failure, generations of it. Animals and plants are 
rich in complex morphological detail, which served as the 
basis for their phylogenetic classification. Bacteria, on the 
other hand, have such simple morphologies that these are of 
no use in defining their phylogeny. Bacterial physiologies are 
more useful in this regard, but are still too limited; while 
shared physiological traits often correctly group bacterial 
species, relatives lacking the trait in question usually exist as
well (56). Some of the early microbiologists, realizing the
pitfalls in attempting to classify bacteria by the then available
criteria, avoided the area, which served to deemphasize
able criteria, avoided the area, which served to deemphasize 
the role of evolutionary considerations in the development of 
microbiology. Those who did concern themselves with 
phylogenetic classification of bacteria created distorted, 
basically flawed schemes which confused rather than re- 
solved problems and ultimately discredited the whole at- 
tempt. (One of the delightful absurdities to emerge from this 
involuted system is the taxon Pseudomonas, perhaps the 
best known, most studied, and most pedagogically utilized
"representative genus," which actually is a collection of at 
least five separate groups of bacteria [53], whose name 
derives from the Greek pseudes and monas, i.e., false unit!) 

The situation seems to have reached a watershed during 
the time of C. B. van Niel. Initially a leader in attempts to 
determine microbial phylogenies (210), van Niel, in apparent 
frustration, ultimately gave up on them, settling for a deter- 
minative classification system the basic purpose of which 
was to identify species (226). Van Niel was perhaps the last 
microbiologist to treat the matter of bacterial evolution 
seriously. 

Without a capacity to determine bacterial genealogies, 
considerations of bacterial evolution are mere cerebral ex- 
ercises. Realizing this, students found it difficult to become 
enthusiastic about the subject. There was no way microbi- 
ology textbooks could treat it seriously either. Many did not 
treat it at all! Moreover, the failure to determine evolution- 
ary relationships seemed to generate the feeling that it was 
not important to do so. Bacterial evolution was all but
forgotten. All that remained to represent this once vital and 
important area of microbiology was a formal and unappeal- 
ing bacterial taxonomy: one falsely authoritative in its bor- 
rowed use of the Linnaean (phylogenetic) classification 
system, stultifying in its liturgy, and caught up in classifica- 
tion for classification's sake. 

The result of all this was that microbiology worked from a 
paradigm that for all intents and purposes was devoid of 
evolutionary concepts. They played no role in its pedagogy; 
they had no influence on the design and interpretation of 
experiments; they were not a part of its value structure. 
Roger Stanier, one of the few microbiologists who main- 
tained any interest at all in bacterial evolution, captured the 
spirit of the times with this piquant proscription (written 
about 1970): "Evolutionary speculation constitutes a kind of 
metascience, which has the same intellectual fascination for 
some biologists that metaphysical speculation possessed for 
some mediaeval scholastics. It can be considered a relatively 
harmless habit, like eating peanuts, unless it assumes the 
form of an obsession; then it becomes a vice" (209). That 
microbiology had reduced evolutionary matters to the status 
of dalliance was indeed unfortunate, for much of what is 
important and interesting about evolution lay hidden in the 
microbial world. 

Fortunately, nucleic acid sequencing technology today 
makes bacterial phylogeny a tractable problem. In fact, all 
phylogenetic relationships can now be determined much 
more easily and in far more detail and depth than was ever 
dreamed possible (116, 174). Microbiology is consequently 
being inundated with sequence information, which accumu- 
lates so rapidly that the reading and entering of data are 
becoming major concerns, while the actual sequencing op- 
erations will soon cease to be rate-limiting factors. The data 
are in a form [pages of A's, C's, G's, and T(U)'s] alien to
most microbiologists, their analysis is arcane, and 
phylogenetic conclusions tend to be presented in a take-it-or-leave-it
manner. It is understandable that some microbiologists
distrust these conclusions. However, it is not permissible
to ignore them. Phylogenies derived from sequence
analysis have to be accepted for what they minimally are: 
hypotheses, to be tested and either strengthened or rejected 
on the basis of other kinds of data (97). 

Microbiology is now at a point at which it has to ask why 
it should concern itself with bacterial evolution and what 
such concern would mean to its future. The emerging 
bacterial phylogeny cannot be viewed as having merely local 
impact, i.e., as being a revision of existing bacterial taxon- 
omy. At the very least the existing taxonomy will be totally 
rewritten by what is currently happening. Even that, how- 
ever, is only the tip of the iceberg. Phylogenetic perspective 
will affect the microbiology paradigm throughout. This is 
already apparent in the change in perspective accompanying 
the discovery of archaebacteria. Bacteria will no longer be 
conceptualized mainly in terms of their morphologies and 
biochemistries; their relationships to other bacteria will be 
central to the concept as well. Design and interpretation of 
experiments will be significantly changed. Microbial bio- 
chemistry will be conceptualized more in a comparative 
way. Medical microbiology will have a broadened perspec- 
tive. Phylogenetic considerations will increase the microbi- 
ologist's interest in microbial ecology and shape his ap- 
proach to it (157, 164). Perhaps the most significant change 
will be the altering of our perception of the relationship 
between procaryotes and eucaryotes and, therefore, of the 
position microbiology holds in relation to the other biological 
disciplines. 

The evolutionist, too, needs to concern himself with the 
effect that opening the "Pandora's box" of bacterial evolution
is going to have, for although he is exquisitely aware of
evolution as it is encountered among the metazoa, the world 
of forms and fossils, bacterial evolution is as alien to him as 
it is to the microbiologist. Determining microbial phylogeny 
is not simply the long awaited completion of the "Darwinian
programme," the extension of evolutionary study to all life
on this planet. Rather than its providing the few missing 
pieces in the great puzzle of evolution, bacterial evolution in 
effect is the puzzle. It increases the current time span of 
evolutionary study by almost an order of magnitude. It holds 
the key to the origin of the eucaryotic cell. It shows the 
evolutionist an intimacy between the evolution of the planet 
and the life forms thereon that he has never before experi- 
enced and which, consequently, will lead to a close relation- 
ship between the geologist (not merely the paleontologist) 
and the evolutionist. It will redefine the classical question 
concerning the connection between evolutionary rate and 
the quality of the resulting change (the so-called tempo-mode 
problem) in new and more powerful terms (see below). Even 
the way we conceptualize selection, the roles of positive 
versus negative selection, may be changed. In other words, 
bacterial evolution will show us that, far from approaching 
the culmination of evolutionary study, where one refines 
existing concepts and fills in the details, we are only at its 
beginning. 

The impact of the study of bacterial evolution should be 
felt throughout all of biology, bringing about major shifts in 
emphasis. The nature of the universal ancestor (which, if it is 
given attention at all today, is seen as just another common 
ancestor of a group) will be recognized for the important 
biological problem that it is. A major effort will be mounted 
to define the ancestral gene families. And the historical 
dimension will become a significant, useful part of the study 
of macromolecular structure. Because its conceptual base
now rests outside of biology per se (in physics and chemistry),
biology's interests, its thrust, have tended increasingly
to be defined by external factors, many even extrascientific. 
Biology has become very much an "other directed" disci- 
pline. What a renewed and broadened interest in evolution- 
ary questions can and will do is to restore to biology an 
internally defined sense of direction. 

Some restructuring at a metaphysical level is even a 
possibility. Biology's base, scientific materialism (that highly 
reductionist, highly mechanistic picture borrowed wholesale 
from the 19th century physicist), is a world view to which 
physicists since Einstein, Bohr, and Schrödinger can no
longer subscribe. If there be anything to Whitehead's concept
of evolution as basic process (236), the interest in
evolution generated by its study in bacteria (and unicellular 
eucaryotes) may push biology in this direction, toward a 
process-oriented outlook, an attitude that processes (evolu- 
tion, development, mind) somehow underlie genes, cells, 
brains, etc., not the reverse. 

Whatever else it is or whatever impact it may have, the 
study of bacterial evolutionary relationships is central to the 
historical account of life on this planet. We may lay no claim 
to a comprehensive understanding of biology until we know 
this history, at least in its outline. And this is the perspective 
from which the present review is written. 

Three Ideas That Shape Our Concept of Bacterial Evolution 

Procaryote-eucaryote dichotomy. The way we look at bac- 
terial evolution is in essence shaped by a few picturesque, 
strongly held notions. One has to do with the place of 
bacteria in the spectrum of living systems; the other two 
concern our picture of how life began. 

The prescientific distinction animal-vegetable-mineral is 
the starting point for our perception of the relationships
among living things. Initially, every living thing was thought 
to be either a plant or an animal. The invention of the 
microscope, however, revealed a world of unicellular crea- 
tures, which because of their enormous and unusual variety 
caused us to wonder whether they were just very small 
animals or plants; there seemed to be another basic distinc- 
tion, between macroscopic and microscopic forms, between 
multicellular and unicellular life. Haeckel's classic phylogeny,
reproduced in Fig. 1, represents an amalgam of these
two early views (76), in having three basic categories of 
living systems: plants, animals, and microorganisms 
(protists). This classical approach to phylogeny, based es- 
sentially on characteristics of the whole organism, has been 
refined in modern times to a five-kingdom scheme (139, 240,
241). However, such a taxonomy is not phylogenetically 
valid (see below). 

It is intuitively evident that certain of an organism's 
characteristics, especially cellular attributes, are more es- 
sential than others. In the 1930s, E. Chatton (25) sought to 
construct a universal phylogeny on this principle by dividing 
the living world into two main groups, eucaryotes and 
procaryotes, on the basis of whether or not they possessed a 
true nucleus, i.e., one circumscribed by a nuclear membrane.
(He also invoked a few other intracellular structures
found only in eucaryotes, such as the mitochondrion, to 
bolster the case [25].) Chatton's approach had clear virtues, 
and once its two basic categories were defined in detail, 
through electron microscopic studies and various molecular 
characterizations, the procaryote-eucaryote dichotomy be- 
came firmly (dogmatically) established as the primary 
phylogenetic distinction (138, 152, 209, 211). 



However, the original definition of the procaryote carried 
an implication that no biologist recognized at the time, an 
implication that had profound consequences. Procaryotes 
were initially defined in a purely negative sense: they did not 
have this or that feature seen in eucaryotic cells. There was 
no logical reason to assume, therefore, that all procaryotes 
(all cells that were not eucaryotes, that is) were specifically 
related to one another. Yet, this is precisely what happened; 
"procaryote" was taken from the start to be a phylogeneti- 
cally coherent taxon (209). 

Over the years the definition of the procaryote (vis a vis 
the eucaryote) expanded from the initial negative one, based 
solely upon noncomparable characteristics, to a positive 
one, based upon comparable, molecular properties. In that 
process one procaryote, Escherichia coli, was assumed to 
represent all. Now we see the error in this never-tested 
assumption. Its unquestioned acceptance probably delayed 
the discovery of archaebacteria by well over a decade. 

Oparin Ocean scenario. The old notion that the living 
world is somehow distinct from, unconnected to, the nonliv- 
ing world (an idea that reflects creation myths, the mind- 
matter dichotomy, and other such things) gave rise to the 
panspermia notion. To counter both this idea and creation- 
ism, biologists were obliged to account for the origin of life 
in physical processes occurring on this planet. The first 
reasonably comprehensive attempt to do this was made by 
A. I. Oparin in the 1920s (159); a similar, but significantly 
different, proposal was made somewhat later by J. B. S. 
Haldane (77); and a slightly modified amalgam of the two is 
the only account of life's beginnings accepted by biologists 
today (104, 149, 160). The Oparin Ocean scenario, as it has 
been called, is frankly a "Just So Story." It is a vague and so 
not too useful hypothesis in its own right, and to the extent 
that biologists treat it as dogma, its scientific effectiveness is 
further diminished (249). 

According to the current Oparin Ocean scenario, the 
primitive oceans became the ultimate repository for the great 
variety of chemicals and biochemicals thought to have been 
produced in the primitive anaerobic (at least, nonoxidizing) 
atmosphere through the action of ultraviolet light and elec- 
trical discharge upon water vapor, carbon dioxide, nitrogen, 
and other gases. In this way the primitive ocean became a 
"soup" of energy-rich biochemicals (104, 149, 160). Inter- 
actions among these produced ever more complicated struc- 
tures, which eventually (somehow) turned into complex 
living (self-replicating) cellular entities (160). Since these 
earliest living systems arose in a bath of nutrients, they had 
no need to synthesize amino acids, nucleotides, and other 
products of intermediary metabolism and so did not develop 
such capacities. In other words, the first organisms were 
extreme heterotrophs, having neither photosynthetic nor 
autotrophic capabilities (160). They were in essence sinks for 
the chemical energy stored in the oceans (249). Only when 
these early cells began to exhaust their oceanic supply of 
nutrients did a need arise for intermediary metabolism, 
autotrophic capacity, and the ability to use light as an energy 
source, and only then did these features evolve, deus ex 
machina (88, 160). 

This scenario has been interpreted by biologists to mean 
that the first organisms were (anaerobic) heterotrophic 
procaryotes, which later spawned photosynthetic and 
autotrophic sublines (and later still, procaryotes capable of 
aerobic metabolism). Textbooks customarily derive all living 
forms from an ancestral anaerobic fermentative heterotroph, 
taken to be some Clostridium or streptococcus (15, 139). 
Eucaryotes are generally brought into this picture through a 

FIG. 1. Haeckel's phylogenetic tree of 1866 (76).






subline of wall-less (anaerobic) procaryotes that gained the 
capacity for endocytosis (138, 139). By ingesting other 
procaryotes that gave it additional metabolic capabilities 
(and molecular structures) and that eventually degenerated 
into organelles (contributing genes to the nucleus in the 
process), this primitive host cell, this urcaryote (256), be- 
came the eucaryotic cell. 

Once again implicit assumptions have misled us, in this 
case the connotation of the prefix "pro-" in procaryote. 
Procaryotes had to precede eucaryotes; procaryotes are ipso
facto older, simpler, and more primitive than eucaryotes and 
therefore gave rise to them. This is perhaps the main, if not 
the only, reason why the vast majority of biologists perceive 
the first organisms as procaryotic. 

Darwin's warm little pond. A final, related element in the 
prejudices shaping our view of bacterial origins is Darwin's 
"warm little pond" image (31), which I am sure Darwin 
never intended to be a prescription for life's beginnings. 
(Darwin understood that the subject belonged to the future 
and probably intended more to dismiss it with this casual 
remark [31] than to give his successors a guiding principle.) 
Nevertheless, we are now stuck with this image of life's 
beginnings and have to cope with it as we do the other 
conceptual baggage we have inherited. Do microbiologists 
perhaps view thermophilic bacteria as adaptations from 
mesophilic species for this reason? "Warm" is an anthro- 
pocentrism. The setting in which bacteria arose may well 
have been warm, but it was not the hospitable warmth 
implicit in the pond Darwin pictured. 

The collection of images associated with the procaryote- 
eucaryote dichotomy, the Oparin Ocean scenario, and, to a 
lesser extent, Darwin's warm little pond form the starting 
point and dictate the direction of our thinking about bacterial 
evolution. The basic flaws in all of them will become even 
more apparent as we proceed. It would be better, if that were 
possible, to forget about the lot and approach bacterial 
evolution with a clean slate. Molecular phylogenetic studies 
of bacteria are going to tell us a great deal, especially if our 
thinking is unfettered by old anthropocentric notions. 

MEASUREMENT OF BACTERIAL PHYLOGENETIC 
RELATIONSHIPS 

Three-Dimensional and One-Dimensional Characters 

We intuitively recognize that complex characteristics 
(phenotypic patterns) are good indicators of relationship, 
i.e., that a sufficiently complex pattern (character) is
unlikely to have evolved more than once. However, our 
assessment that two or more organisms have the same or 
similar complex characteristics is by no means fail-safe. The 
judgment "similar" is too often subjective; what appears
complex to our eye may not be so in the dynamics of the 
organism, and something that the biologist imagines as 
difficult to evolve may in reality be relatively simple (due to 
constraints on the system that he has not recognized, for 
example). Nowhere are our failings in this regard more 
evident than in the attempts to classify bacteria. 

The sequencing of proteins and nucleic acids provides a 
new and more powerful approach to measuring evolutionary 
relationships and a new way of looking at them, in terms of 
the "evolutionary clock" (243). Genotypic information, i.e., 
sequence information, is superior in two main ways to 
phenotypic information, the classical basis for relating and 
classifying organisms: sequence information is (i) more 
readily, reliably, and precisely interpreted and (ii) innately
more informative of evolutionary relationships than pheno-
typic information is.

Unlike three-dimensional phenotypic patterns, a sequence 
pattern is one dimensional. One-dimensional patterns can be 
measured in simple ways, in terms of simple relationships. 
The elements of a sequence, nucleotides or amino acids, are 
restricted in number and well defined (quantized). The 
subjectivity that goes into the judgments "same," "simi- 
lar," etc., at the phenotypic (three-dimensional) level is 
replaced by simple, more objective judgments and mathe- 
matically defined relationships in the world of sequences. 

The evolutionary clock. The introduction of genetics into 
our model of the evolutionary process in the early part of this 
century was a major advance in that it let us understand 
evolution's "motor," the source of the variation upon which 
selection works. The concept of the evolutionary clock 
furthers this understanding; it shows us the relationship 
between this motor (i.e., genotypic change) and what we 
classically call evolution, the changes in phenotype. At the 
level of the genotype, change constantly occurs. However, 
most of it is of a nature that it is not acted upon by selection 
(105, 106). It therefore becomes fixed randomly in time,
making its characterization as "clocklike" in occurrence 
appropriate. In other words, evolution has a tempo that is 
quasi-independent of its mode (the selected changes occur- 
ring in the phenotype). An analogy to a car and its motor is 
apt: a car does not go unless its motor is running, but the 
motor can run without the car moving. 

Cytochrome c evolution provides a good example of the 
evolutionary clock. An enormous number of different ver- 
sions of this sequence all appear to be equivalent function- 
ally (44). (Formally speaking, the mapping from genotype to 
phenotype [upon which selection acts] is degenerate.) A 
change from one such version to another would then occur 
randomly, independent of selection; the probability of its 
occurrence would only reflect a lineage's mutation rate (105, 
106). Since the number of possible functional configurations 
for a given gene is enormous by any standards, similarity at 
the genotypic level (i.e., extensive sequence homology) can 
never reflect convergent evolution. Consequently, cyto- 
chrome c sequence comparisons have been used very suc- 
cessfully to time key events in eucaryotic evolution (in 
sequence distance terms) and to determine molecular gene- 
alogies. Phylogenetic trees based upon cytochrome c, or 
similar molecular chronometers, represent significant im- 
provements over their classical counterparts based upon 
phenotypic comparisons (49). 

A missed opportunity. The now classic paper of 
Zuckerkandl and Pauling, in 1965, effectively launched biol- 
ogy into the world of molecular chronometers (283). Al- 
though they may not have been the first to recognize that 
sequence comparisons could be used to define phylogenetic 
relationships (protein-sequencing technology had been in 
existence for about a decade by then), Zuckerkandl and 
Pauling were the first to put the case in a well-defined, 
scientifically effective way. This was precisely what the 
microbiologist needed to resolve the problems of the natural 
relationships among bacteria. However, microbiologists did 
not rush to utilize the new approach; remember, they had by 
then come to see the problem as unimportant. Nevertheless, 
a peripheral sort of awareness existed; the new molecular 
techniques were perceived by some as useful in standard 
bacterial classification, and a number of small-scale efforts 
were mounted to improve classification by various molecular 
approaches. Genetic characterization, deoxyribonucleic 
acid (DNA) base ratios, nucleic acid hybridization studies, 
cell wall analyses, and (a little) protein sequencing began to
reveal phylogenetically valid groupings (4, 140, 183, 188). 

These early molecular approaches, though useful, were 
not powerful enough to reveal the higher bacterial taxa, and 
in any case conventional wisdom did not perceive doing so 
as important. While gram-positive cell walls exhibited inter- 
esting and informative variety in their composition, the 
gram-negative ones were too uniform to be of much use in 
defining taxa (183). Nucleic acid hybridization work, i.e., 
DNA/DNA studies, was (necessarily) confined to relation-
ships within genera (185, 207). When DNA/ribosomal
ribonucleic acid (rRNA) hybridization studies were insti- 
tuted, they were used only to revise existing local taxonomic 
structure (32, 33). Even protein sequence comparisons pro- 
vided no insights, except in one particular group of purple 
bacteria (4, 148). 

Nature of Molecular Chronometers 

A molecule whose sequence changes randomly in time can 
be considered a chronometer. The amount of sequence 
change it accumulates (formally a distance) is the product of 
a rate (at which mutations become fixed) × a time (over
which the changes have occurred). The biologist cannot 
measure this change, however, by comparison of some 
original to some final state, since the original state (ancestral 
pattern) is not accessible to him. Instead, he uses the fact 
that two (or more) versions of a given sequence that occur in 
extant representatives of two (or more) lineages have ulti- 
mately come from the same common ancestral pattern, and 
so measures the sequence difference between the two (or 
more) extant versions, which is roughly twice the amount of 
change that each lineage has undergone (assuming compara- 
ble rates of change in each) since they last shared a common 
ancestor. 
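
The relationship just described can be written out explicitly. The following is only a sketch under the stated assumption of comparable rates in the two lineages; the symbols r, t, d, and D_AB are introduced here for illustration and do not appear in the original.

```latex
% r    : rate at which changes become fixed (assumed the same in both lineages)
% t    : time since lineages A and B last shared a common ancestor
% d    : change accumulated along one lineage
% D_AB : sequence difference measured between the two extant versions
d_A \approx d_B \approx r\,t
\qquad\Longrightarrow\qquad
D_{AB} \approx d_A + d_B = 2\,r\,t
\qquad\Longrightarrow\qquad
d \approx \tfrac{1}{2}\,D_{AB}
```

For instance, a measured difference of 0.10 changes per position between two extant sequences would correspond to roughly 0.05 changes per position along each line of descent.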

All sequences are not of equal value in determining 
phylogenetic relationships. To be a useful chronometer, a 
molecule has to meet certain specifications as to (i) clocklike 
behavior (changes in its sequence have to occur as randomly 
as possible), (ii) range (rates of change have to be commen- 
surate with the spectrum of evolutionary distances being 
measured), and (iii) size (the molecule has to be large enough 
to provide an adequate amount of information and to be a
"smooth-running" chronometer [explained below]).

Clocklike behavior. A molecular chronometer should mea- 
sure, should be representative of, the overall rate of evolu- 
tionary change in a line of descent. One might think that the 
best chronometer would then be a genetic segment upon 
which there are no selective constraints. Its changes would 
occur randomly along the length of the segment, occur in a 
quasi-clocklike fashion, become fixed at a rate equal to the 
lineage's mutation rate (105, 106), and be easy to interpret. 
However, such sequences are of little value for phylogenetic 
measurement, because they generally do not meet the sec- 
ond requirement; their rates are so rapid that they cover only 
very restricted phylogenetic ranges (38). Such sequences are 
evolutionary stopwatches; they measure only the short-term 
evolutionary events. The third (i.e., degenerate) codon po- 
sitions in structural genes are often used in this capacity (20, 
128, 243). 

The more useful molecules for phylogenetic measurement 
all represent highly constrained functions. Some sequences 
of this type change slowly enough to span the full evolution- 
ary spectrum. Unfortunately, what makes them useful as 
chronometers also makes them problematical. Strict clock- 
like behavior is usually hard to find, i.e., identify. Unless
functional constraints remain strictly constant over the evo-
lutionary range being covered, nonrandom (selected) se- 
quence changes accumulate, over and above the randomly 
introduced ones, and artificially increase phylogenetic dis- 
tances between organisms, which usually leads to improper 
determinations of branching orders. Cytochrome c provides 
an example. In the α subdivision of the purple bacteria
(defined below) the molecule changes in size, from "me- 
dium" to "large" (37), reflecting some unknown and subtle 
functional change. This size change has brought with it 
additional nonrandom sequence changes that appear to 
distort the phylogenetic determination somewhat (4). A 
similar situation can be seen in the phylogeny of certain 
Bacillus species constructed from 5S rRNA sequence com- 
parisons (87). 

A second problem with highly constrained chronometers 
is the extremely different rates at which the various positions 
in a sequence tend to change. This in itself is not a problem; 
the hands of a clock move at very different rates. However, 
analysis of the data becomes difficult at this point (see 
below). 

Phylogenetic range. The world of bacterial evolution is 
vast in comparison to that of the eucaryotes with which we 
are familiar, i.e., the metazoa. A billion years is a relatively
short time in bacterial evolution. Therefore, the range of 
chronometers used to measure phylogenetic relationships 
among bacteria needs to be considerably greater than what is 
optimal for metazoa. Cytochrome c is an excellent chronom- 
eter for measuring much of eucaryote phylogeny (49). 
Among the eubacteria, its effective range is restricted to the 
subdivision level; it orders the α-purple bacteria (see below),
but does not relate these accurately to any other subdivision 
of the purple bacteria (4, 148). 

Size and accuracy. In addition to the obvious need for large 
size in a chronometer (i.e., good statistics), there is a more 
subtle one. Size per se is probably not what is important. It 
is that the molecule consists of a fairly large number of 
loosely coupled "domains" (functional units), regions that 
are somewhat independent of one another in an evolutionary 
sense (250). In this case, nonrandom changes affecting one of 
the units will not appreciably affect the others; therefore, 
when one part of the chronometer becomes drastically 
altered by introduction of selected changes (i.e., gives a 
distorted reading), the other parts remain practically unaf- 
fected. The more units of this kind a chronometer contains, 
the less sensitive are its measurements of evolutionary 
distances to nonrandom changes in one of them, i.e., the 
"smoother" the chronometer runs. This is a major differ- 
ence between 5S rRNA, for example, and the large rRNAs 
(250). 

rRNAs, the Ultimate Molecular Chronometers

Why they are so good. rRNAs are at present the most
useful and most used of the molecular chronometers. They
show a high degree of functional constancy, which assures
relatively good clocklike behavior (250). They occur in all
organisms, and different positions in their sequences change 
at very different rates, allowing most phylogenetic relation- 
ships (including the most distant) to be measured, which 
makes their range all-encompassing. Their sizes are large 
and they consist of many domains. There are about 50 helical 
stalks in the 16S rRNA secondary structure and roughly 
twice that number in the 23S rRNA (75, 155), which makes 
them accurate chronometers on two counts. 

Perhaps the most compelling reason for using rRNAs as 
chronometers is that they can be sequenced directly (and, 
therefore, rapidly) by means of the enzyme reverse tran-
scriptase (116, 174). This distinguishes them from all other 
potential chronometers in the cell except for a few of the 
smaller RNA species (which do not have as good chrono- 
metric properties, and in some cases cannot be isolated with 
the ease with which rRNAs can). It is reasonable for a 
properly equipped laboratory in the future to sequence on 
the order of 100 16S rRNAs per year. 

Oligonucleotide cataloging. Until several years ago it was 
not feasible to determine complete rRNA sequences. So 
more than a decade was spent characterizing them in terms 
of partial sequences, by the so-called oligonucleotide cata- 
loging method. Short oligonucleotides, of lengths up to 20 or 
so bases, are produced by digestion of 16S rRNAs with 
ribonuclease T1 (which cleaves specifically at G residues)
(55); a collection of these sequence fragments from a given 
rRNA constitutes an oligonucleotide catalog: a detailed, 
complex pattern characteristic of a given bacterial species 
(55). Comparisons among these catalogs permit phylo-
genetic groupings to be identified at various taxonomic
levels, including the highest (55, 257). 

Oligonucleotide catalog data are usually analyzed in terms 
of binary association coefficients, so-called Sab values (55), 
defined as the ratio of twice the sum of bases in oligonucle- 
otides (length greater than five) common to two catalogs A 
and B, to the sum of all bases (in oligonucleotides of length 
greater than five) in the two catalogs (55). The relationship 
between Sab value and percentage sequence similarity can- 
not be theoretically derived (because the relative rates at 
which individual positions change are not predictable a 
priori). A plot of percent similarities and corresponding Sab
values for a collection of 16S rRNA sequences is shown in 
Fig. 2. Sab is seen to vary approximately as the fifth to sixth 
power of percent similarity. The relationship between the 
two measures is clearly not a precise one, especially below 
Sab values of about 0.40. 
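
The Sab definition given above can be illustrated with a short sketch. It assumes that each catalog is simply a list of oligonucleotide strings and that "common" means a multiset intersection (an oligonucleotide occurring twice in both catalogs counts twice); the function name `s_ab` and the toy catalogs are hypothetical and are not taken from reference 55.

```python
from collections import Counter

MIN_LEN = 6  # only oligonucleotides of length greater than five are counted


def s_ab(catalog_a, catalog_b, min_len=MIN_LEN):
    """Binary association coefficient for two oligonucleotide catalogs.

    Returns 2 * (bases in oligonucleotides common to A and B)
            / (bases in A + bases in B),
    counting only oligonucleotides of at least `min_len` bases.
    """
    a = Counter(o for o in catalog_a if len(o) >= min_len)
    b = Counter(o for o in catalog_b if len(o) >= min_len)

    # Multiset intersection: assumed interpretation of "common to two catalogs".
    common = a & b
    shared_bases = sum(len(o) * n for o, n in common.items())

    total_bases = (sum(len(o) * n for o, n in a.items())
                   + sum(len(o) * n for o, n in b.items()))
    if total_bases == 0:
        raise ValueError("neither catalog contains a long enough oligonucleotide")
    return 2 * shared_bases / total_bases


# Toy catalogs (hypothetical). Each fragment ends in G, as expected after
# ribonuclease T1 digestion; "AAG" is too short to be counted.
catalog_a = ["AUCCUG", "CCUAAUG", "AAG"]
catalog_b = ["AUCCUG", "UUACCAG"]
print(s_ab(catalog_a, catalog_b))
```

In this toy case one six-base oligonucleotide is shared out of 13 counted bases in each catalog, so the coefficient is 12/26.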

While the cataloging method sufficed to define most of the 
major bacterial phyla, it generally failed to resolve the 
branching orders among them or among their subdivisions. 
(Such distinctions are never easily made, as evidenced by 
the fact that the animal phyla have been known for a 
century, but the order of their branching from one another 
has yet to be determined.) The cataloging approach also ran 
into difficulties over the branching order of rapidly evolving 
lines of descent, again a perennial problem. Full sequencing 
of 16S rRNA has now replaced the earlier oligonucleotide 
cataloging approach, a development that greatly increases 
the resolving power of the rRNA chronometer. 



Analysis of Sequence Alignments 

Given a sequence alignment, which for rRNAs can be 
constructed in a straightforward empirical manner (75, 260), 
the question becomes how to analyze it, i.e., how to extract 
the most phylogenetically useful information.

Distance matrix methods. Distance matrix methods (47, 49,
157) utilize only the sequence distances between pairs of 
sequences, i.e., the fraction of positions in which the two 
sequences differ. This distance is actually an underestimate 
of the true evolutionary distance between sequences; al- 
though most of the differences between two sequences 
reflect single mutational events at any given position in a
sequence alignment (when the chronometer is working in its
effective range, that is), some of them represent multiple 
events. Were all lineages and all positions in a sequence 
changing at the same rate, then correction for this effect, 
conversion of sequence distances to evolutionary distances, 
would be a relatively simple matter (99). Unfortunately, this 
is not the case, either for different lineages or for different 
positions in the sequence, and the proper correction for 
multiple changes remains a major problem in tree construc- 
tion. 
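
For the idealized case in which all positions and lineages change at the same rate, the conversion from sequence distance to evolutionary distance can be sketched as follows. The sketch uses the Jukes-Cantor formula, one standard equal-rates correction for four-letter sequences; that this is the particular correction intended by the citation above is an assumption, and the function names are hypothetical. As the text notes, the equal-rates premise does not hold for real rRNA data, so values obtained this way should be read as underestimates.

```python
import math


def sequence_distance(seq1, seq2, gap="-"):
    """Fraction of aligned positions at which two sequences differ.

    Positions where either sequence has a gap are skipped (an assumption
    made here for simplicity).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(seq1, seq2) if x != gap and y != gap]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p):
    """Equal-rates correction: changes per position implied by a
    fraction p of differing positions, allowing for multiple events."""
    if not 0.0 <= p < 0.75:
        raise ValueError("correction is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


p = sequence_distance("ACGUACGU", "ACGUACGA")  # one difference in eight
print(p, jukes_cantor(p))
```

The corrected value always equals or exceeds the raw fraction of differences, and the gap between the two widens as the sequences diverge.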

Distance matrix treeing assumes that evolutionary dis- 
tances conform to a tree topology. To use a simple example, 
let AB, AC, AD, BC, BD, and CD be the six determined 
evolutionary distances among four sequences, A, B, C, and 
D. If the three pairwise sums AB + CD, AC + BD, and AD 
+ BC meet the condition that two of them are equal and 
greater than or equal to the third, a condition that can be 
understood by reference to Fig. 3, then the data fit a tree 
topology. When this condition is met even approximately, 
evolutionary distances can be used to reconstruct phylog- 
enies with reasonable accuracy (36, 47, 49, 157). 
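
The condition on the three pairwise sums can be checked directly. The sketch below builds the six distances from the branch lengths a, b, c, d, and x of Fig. 3 (the numerical values are arbitrary, chosen only for illustration) and tests whether the two largest sums agree; the function names and the tolerance parameter are assumptions introduced here.

```python
def pairwise_sums(ab, ac, ad, bc, bd, cd):
    """The three ways of pairing four sequences, with their distance sums."""
    return {"AB|CD": ab + cd, "AC|BD": ac + bd, "AD|BC": ad + bc}


def fits_tree_topology(ab, ac, ad, bc, bd, cd, tol=1e-9):
    """True if two of the sums are equal (within tol) and not smaller
    than the third, i.e., the two largest sums coincide."""
    sums = sorted(pairwise_sums(ab, ac, ad, bc, bd, cd).values())
    return abs(sums[2] - sums[1]) <= tol


def implied_grouping(ab, ac, ad, bc, bd, cd):
    """The pairing with the smallest sum, which omits the internal branch."""
    sums = pairwise_sums(ab, ac, ad, bc, bd, cd)
    return min(sums, key=sums.get)


# Branch lengths as labeled in Fig. 3 (arbitrary illustrative values).
a, b, c, d, x = 1.0, 2.0, 3.0, 4.0, 5.0
dist = dict(ab=a + b, ac=a + x + c, ad=a + x + d,
            bc=b + x + c, bd=b + x + d, cd=c + d)

print(pairwise_sums(**dist))
print(fits_tree_topology(**dist), implied_grouping(**dist))
```

Here AC + BD and AD + BC both exceed AB + CD by 2x, twice the length of the internal branch, which is why the two larger sums must agree when the data are treelike.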

Given a matrix of (corrected) evolutionary distances for a 
sequence alignment, one in principle examines all possible 
phylogenetic trees (branching orders), treating branch 
lengths as adjustable parameters, and declares the one that 
best fits the data (by a least-squares analysis) to be the 
"correct" tree (47, 49, 157). This, of course, is not how 
things are actually done, for the number of all possible 
branching orders rapidly becomes computationally unman- 
ageable as the number of sequences in the alignment reaches 
even a moderate number. There are many approximations, 
many competing algorithms for giving the "best" tree in a 
reasonable amount of time (47, 49, 157). 
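
How quickly the number of branching orders grows can be made concrete. The sketch below relies on a standard combinatorial result that is not stated in the text: for n labeled sequences there are 1 x 3 x 5 x ... x (2n - 5) distinct unrooted, strictly bifurcating trees. The function name is hypothetical.

```python
def count_unrooted_trees(n):
    """Number of unrooted, strictly bifurcating trees on n labeled
    sequences: the product of the odd numbers 1, 3, ..., 2n - 5."""
    if n < 3:
        raise ValueError("at least three sequences are needed")
    count = 1
    for k in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n - 5
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, count_unrooted_trees(n))
```

Four sequences allow only three branching orders, but ten already allow more than two million, which is why exhaustive evaluation gives way to approximate search.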

Maximum parsimony analysis. Unlike sequence distance 
matrix analysis, maximum parsimony analysis does not 
reduce the differences among sequences to a single number, 
a distance; it treats the positions individually (47, 48). (Each
branch in the tree, each segment between branching points,
is defined by the specific changes that occur [in some 
ancestral sequence] on that branch.) As with distance matrix 
treeing, one in principle looks at all possible tree branching 
arrangements and chooses the most parsimonious one (47, 
48). Also, as with distance matrix treeing, the problem of 
finding the correct tree can be computationally intense, even
more so in this case, and much time and effort have gone into 
devising computer algorithms that do this as efficiently as 
possible (46-48). 

Cluster analysis. Cluster analysis, the third major method 
of analyzing sequence data, [...]

The main difficulty in all analyses of sequence data lies in
the fact that different lineages and different positions in a 
sequence can evolve at significantly different rates. Making 
distance corrections on the assumption that all positions in a 
sequence change at the same rate (65, 99, 157) underesti-
mates the correction needed. Parsimony analysis tends not
to position rapidly evolving lineages correctly and is con- 
fused by rapidly changing positions, perhaps more so than is 
distance treeing (157). Cluster analysis is especially sensitive 
to these problems; rapidly evolving lineages are as a rule
positioned too deeply in trees so constructed. In other
words, analyses of sequence data today are far from optimal.

FIG. 2. Plot of percent sequence similarity versus binary association coefficient (Sab value [55]) for a representative sampling of
eubacterial and archaebacterial sequences (unpublished analysis). The two theoretical curves are X^5 (upper curve) and X^6 (lower curve),
where X = percent similarity. Symbols: O, values for pairs of eubacterial sequences; □, values for eubacteria with archaebacteria; ▲, values
for E. coli with either eubacteria or archaebacteria.

FIG. 3. Unrooted tree for four species, A, B, C, and D, illustrat-
ing the relationship that must hold among the six evolutionary
distances, AB = a + b, AC = a + x + c, AD = a + x + d, BC =
b + x + c, BD = b + x + d, and CD = c + d, for them to fit a tree
topology (see text).

The needed improvements in analysis will not come 
primarily from better theoretical treatments or more efficient 
algorithms per se. Improvement will be basically the result 
of empirical approaches. Given a large enough sequence 
data base, it will become possible to describe the pattern 
(rate and kind) of change at given positions in a (rRNA) 
molecule and design specific analyses based upon this de- 
scription, analyses that correct more accurately for multiple 
changes at sites and utilize only those positions in the 
molecule appropriate to the phylogenetic range being mea- 
sured. (The optimal method will not consider the second 
hand when timing the seasons.) 

Consideration of the detailed chronometric structure of 
rRNA will be postponed until the reader has some familiarity 
with bacterial phylogeny. However, the reader who already 
has this familiarity may wish to read that discussion (which 
begins on page 253) at this point. 

DO BACTERIA HAVE A GENEALOGY AND A 
MEANINGFUL TAXONOMY? 

In classifying bacteria microbiologists make two implicit 
assumptions: (i) that bacteria have a phylogeny, and (ii) that 
the taxonomic system that works well for the metazoa is 
actually applicable to, i.e., meaningful in, the microbial 
world. These two points require explication and discussion, 
for they are far from self-evident. 

In questioning the first assumption, one at the very least
questions approaches to measuring bacterial genealogies and 
beyond that whether in principle the bacterium as a whole 
has a genealogy, a unique history. These questions are raised 
by the existence of lateral (interspecies) transfer of genetic 
information among bacteria (23, 177). A given gene (or set of 
genes), say for nitrogen fixation in Azotobacter sp., might 
have been evolved in an organism not immediately related to 
Azotobacter and have been acquired by that organism 
through plasmid transfer or some similar process. When
used as molecular chronometers such genes would not yield 
the correct genealogy for Azotobacter. In the extreme, 
interspecies exchanges of genes could be so rampant, so 
broadspread, that a bacterium would not actually have a 
history in its own right; it would be an evolutionary chimera, 
a collection of genes (or gene clusters), each with its own 
history. 



Fortunately the matter is experimentally decidable. Were 
an organism an evolutionary chimera, then its various chro- 
nometers would yield different, conflicting phylogenies. A 
limited test of the possibility can be made for the α subdivi-
sion of the purple bacteria, for which a number of species
have been characterized by both rRNA catalogs and cyto- 
chrome c sequences. Phylogenetic trees derived from the 
two molecules have nearly the same topology, strongly 
suggesting that neither chronometer has been involved in 
interspecies gene transfer (258). Although more extensive 
testing of the lateral transfer notion is highly desirable, it is 
now relatively safe to assume that bacteria do in principle 
have unique, characteristic evolutionary histories and that at 
least some of the cell's chronometers record them. (What is 
not known is the fraction of the functions in a bacterial cell 
that are subject to interspecies gene transfer, and which ones 
these are.) 

Given that a bacterial genealogy exists, the question (the 
second assumption above) then becomes whether such a 
cladogram can be divided into zones (into taxa) that are 
naturally, as opposed to artificially, defined and, if so, 
whether the groupings arrange themselves naturally into 
simple hierarchical structures (into taxonomic levels). 

The metazoan (Linnaean) classification system, though 
still imperfectly implemented, is as useful as it is because 
metazoa (chiefly animals) intrinsically group into (naturally 
defined) categories. The metazoan kingdoms and the animal 
phyla (192), for example, are readily differentiable group- 
ings. And, although fine points are debated, a metazoan 
species is also well defined, largely because of the con- 
straints imposed by mating (144). A bacterial species is 
certainly far more problematical a concept than a metazoan 
species (8, 29). In the present context, however, our concern 
is essentially with the higher taxa: whether or not (within a 
bacterial urkingdom) these are somehow naturally defined or 
are mere artificial constructs. 

There is no compelling evidence to suggest that the 
bacteria fall into naturally defined taxa. In fact, existing 
evidence might even suggest the contrary. Few of the 
(extensively investigated) bacterial phyla presented below 
can be defined by phenotypic properties common to all 
members of the group. For example, the gram-positive 
bacteria defined by cell wall structure form a clade, but this 
clade also includes bacteria that do not have gram-positive 
walls. Although the purple bacterial group is named for the 
particular type of photosynthesis done by some of its mem- 
bers, the photosynthetic pigment does not define the group 
as a whole, which also includes many nonphotosynthetic 
species. 

Nevertheless, I feel that ultimately bacteria will be shown 
to fall into naturally defined taxa. One reason this is not 
obvious at present may be that various bacterial groups have 
been studied from different perspectives: what we know to 
be characteristic of one may never have been looked for in 
another. This fact alone could explain some of the apparent 
lack of phenotypic resemblance among genealogically clus- 
tered species. Another reason is that the microbiologist has 
never before had phylogenetically defined groupings that he 
could count on to direct his search for phenotypically 
unifying characters. Recent studies utilizing such an ap- 
proach (97) appear promising. 

The main reason for thinking that bacterial taxa are 
naturally defined is that the characteristics of the rRNA 
chronometer (discussed below) strongly suggest this. Under 
certain circumstances rRNAs will accumulate unusual sequence
changes, ones that normally occur with a negligible
frequency. These unusual changes seem to accompany the
formation of various major branches on the tree, which 
correspond to major shifts in bacterial phenotype. The 
reason for this lies in the tempo-mode relationship. Thus the 
rRNA chronometer may provide a means of defining natural 
bacterial groupings that is purely genotypic, independent of 
any phenotypic definition thereof. 



THE UNIVERSAL PHYLOGENETIC TREE 

Although we evolutionists still have much to learn a 
century after Darwin's death, a major milestone has recently
been reached. Molecular phylogenetic approaches let us for
the first time see the full extent of the tree that encompasses 
all extant life (72, 263); see Fig. 4. As yet this tree must be 
drawn in an unrooted form, because the crucial question of 
the point in its structure that corresponds to the Universal 
Ancestor, the point from which all extant life ultimately 
emanates, remains unanswered. 

Perhaps the most striking characteristic of the universal 
tree is the distinctness of the primary kingdoms, the large 
sequence distances that separate one kingdom from another. 
Figure 5, a plot of percent sequence similarities for an
extensive collection of procaryotic 16S rRNAs, demonstrates
this in graphic form; every eubacterial sequence is far
closer to every other eubacterial sequence than to any 
archaebacterial sequence, and vice versa. (The interkingdom 
distances in Fig. 4 are even best considered lower bounds, 
for they are large enough that they could well have been 
underestimated.) 
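
The separation described for Fig. 5 amounts to a simple test on an alignment: the lowest within-kingdom similarity must exceed the highest between-kingdom similarity. The sketch below illustrates that test. The function names, the rule of skipping gapped columns, and the toy sequences are assumptions made for illustration; they are not the data or the procedure behind Fig. 5.

```python
from itertools import combinations


def percent_similarity(a, b):
    """Percent identity of two aligned sequences, ignoring columns with a gap ('-')."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def groups_are_distinct(group_a, group_b):
    """True if every within-group pair is more similar than every between-group pair."""
    within = [percent_similarity(x, y)
              for group in (group_a, group_b)
              for x, y in combinations(group, 2)]
    between = [percent_similarity(x, y) for x in group_a for y in group_b]
    return min(within) > max(between)


# Toy aligned fragments, invented for illustration (not real rRNA data).
eubacteria = ["GAUCCGGAUC", "GAUCCGGAUU", "GAUCCGGACC"]
archaebacteria = ["CCGAUGGAUC", "CCGAUGGAUU"]
print(groups_are_distinct(eubacteria, archaebacteria))
```

With real data the same comparison would be run over full-length aligned 16S rRNA sequences, where the handling of gaps and ambiguous bases matters far more than in this toy case.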

The extent of sequence distance that separates the pri- 
mary kingdoms is reflected in the degree of phenotypic 
difference among them. It has long been obvious that 
eucaryotes are quite distinct from procaryotes (i.e., 
eubacteria); the two differ in general cellular organization, in 
genome structure, in control and expression of genetic 
information, in the structure of the translation apparatus, 
and in the details of numerous molecular functions. In the 
last decade we have found the same to be true for the 
archaebacteria. They too have unique molecular architec- 
ture, cellular organization, genome structure, etc. (see as- 
sorted chapters in references 102 and 270). The degree of 
genotypic and phenotypic separation among the primary 
kingdoms argues that the ancestor they all shared was a 
special sort of entity (251), whose nature will be discussed 
below. 

Each of the primary kingdoms has its particular form of
rRNA. Secondary structures for the three types of 16S-like
rRNAs can be seen in Fig. 6. While there are general 
resemblances among them, there are also characteristic 
differences in structural detail (75). 

Strong rRNA sequence signatures, i.e., positions in the 
molecule that have a highly conserved or invariant compo- 
sition in one kingdom, but a different (highly conserved) 
composition in one or both of the others, also define and 
distinguish the three urkingdoms. Table 1 is a signature 
distinguishing archaebacteria from eubacteria; the location 
of these positions in relation to the molecules' secondary 
structure can be seen in Fig. 7. Though not presented in this 
review, equally pronounced signatures exist for eucaryotes 
(75, 260). 
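
Once sequences are aligned, a signature of this kind can be searched for mechanically. The following sketch applies a rule modeled on the upper-case criterion of Table 1 (a base present in all or all but one sequence of one group and in no more than one sequence of the other). The function name, the `max_exceptions` parameter, the toy alignment, and the use of alignment columns rather than E. coli numbering are all illustrative assumptions.

```python
from collections import Counter


def signature_positions(group_a, group_b, max_exceptions=1):
    """Return {column: (base_a, base_b)} for columns that separate two groups.

    A column qualifies when each group has a dominant base with at most
    `max_exceptions` deviating sequences, the two dominant bases differ, and
    each dominant base occurs in at most `max_exceptions` sequences of the
    other group. Columns are numbered from 1 in the alignment.
    """
    length = len(group_a[0])
    if any(len(s) != length for s in list(group_a) + list(group_b)):
        raise ValueError("all sequences must be aligned to the same length")
    result = {}
    for i in range(length):
        col_a = [s[i] for s in group_a]
        col_b = [s[i] for s in group_b]
        base_a, n_a = Counter(col_a).most_common(1)[0]
        base_b, n_b = Counter(col_b).most_common(1)[0]
        conserved = (len(col_a) - n_a <= max_exceptions
                     and len(col_b) - n_b <= max_exceptions)
        exclusive = (col_b.count(base_a) <= max_exceptions
                     and col_a.count(base_b) <= max_exceptions)
        if conserved and exclusive and base_a != base_b:
            result[i + 1] = (base_a, base_b)
    return result


# Toy alignment, invented for illustration.
group_a = ["GAUC", "GAUC", "GACC"]
group_b = ["GGUA", "GGUA", "GGUA"]
print(signature_positions(group_a, group_b))
```

In this toy case columns 2 and 4 qualify, while column 3 does not, because its dominant base is shared by both groups.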

Archaebacteria can resemble either eubacteria or eucary- 
otes (or neither), depending upon what phenotypic charac- 
ters are considered. For example, the 16S rRNAs of the two 
procaryotic kingdoms are relatively close in structure (75), 
whereas 7S RNA sequence and structure reveals a resem- 
blance between archaebacteria and eucaryotes (129, 134, 
151; B. P. Kaine, unpublished results). Eucaryotes and 
eubacteria have similarities as well, e.g., ester-linked lipids 
(117, 118), but these are relatively few. Our present, rather 
limited understanding would suggest that the overall pheno- 
typic resemblance is greatest between the archaebacteria 
and eucaryotes. 



What the biologist, especially the microbiologist, must 
now fully recognize is that there no longer exists any reason 
to consider that archaebacteria and eubacteria are related to 
one another to the exclusion of eucaryotes. Unfortunately, 
the title of this review, "Bacterial Evolution," implies the
opposite. The title is intended, however, merely as a cele- 
bration of the fact that within the last 10 years the field of 
bacterial evolution has passed from a suspect discipline, 
about which almost nothing was known, to a full-fledged 
area of scientific investigation, rich in its implications for all 
of biology. 

EUBACTERIAL PHYLOGENY 
Background 

Not only did we know very little about eubacterial 
phylogeny before the advent of the rRNA approach, but 
what we thought we knew tended to be wrong. The old idea 
(justified to some extent by cell wall compositions [183]) that 
there are two primary categories of (eu)bacteria, gram pos- 
itive and gram negative, turns out to be a half-truth. Gram 
positive is indeed a phylogenetically coherent grouping, but 
gram negative is not. The latter encompasses of the order of 
10 distinct groups, each the equivalent of the gram-positive 
one, as we will see. The old idea that wall-less bacteria, 
mycoplasmas, are phylogenetically remote from other 
(eu)bacteria (62, 176) is incorrect; the true mycoplasmas are 
merely "degenerate" Clostridia (see below). Photosynthetic 
bacteria do not form a grouping genealogically distinct from 
the nonphotosynthetic bacteria (92, 168, 169); actually, the 
major photosynthetic types each represent separate high- 
level phylogenetic units which in most cases include many 
nonphotosynthetic species as well (56, 64, 206, 266). 
Autotrophs and heterotrophs are not phylogenetically sepa- 
rate groupings; they are intimately intermixed within the 
various eubacterial phyla (56, 206). Thus, the textbook views 
in which photosynthetic (or autotrophic) bacteria arise from 
nonphotosynthetic heterotrophic ancestors (15, 139) gain no 
support from the rRNA-based phylogeny. It is the classical 
microbiologist's insistence on morphology as the primary 
criterion (108), a prejudice inherited from the botanist, that 
more than anything engendered the confused and confusing 
state of bacterial taxonomy; almost none of the taxa (beyond 
the level of genus) defined primarily in this way pass 
phylogenetic muster (56). 

The misconceptions of the classical microbiologist cannot 
be condemned. They were innocent attempts to create a 
phylogenetic framework at a time when phylogenetic classi- 
fication was not experimentally possible. Unfortunately, 
many of them are now formalized as accepted bacterial 
taxonomy: they are repeated over and over in textbooks. 
They shape the design and interpretation of our experiments. 
And, because this taxonomy is patterned after metazoan 
classification, it is a de facto phylogenetic statement. Such 
an entrenched, organized system will not change easily. Yet 
replacing the old taxonomy with a phylogenetically valid one 
is vital to the future development of microbiology. 

Major Eubacterial Groups and Their Subdivisions 

As of this writing over 500 species of eubacteria have been
characterized in rRNA terms. Although most of these characterizations
utilized the older and less informative oligonucleotide
cataloging method, the more than 50 nearly complete
sequences now known are sufficient in number that the
conclusions derived from the older approach can be significantly
refined and extended.

TABLE 1. Sequence signature distinguishing the two procaryotic kingdoms^a

a Based upon approximately 30 eubacterial and 12 archaebacterial 16S rRNA sequences (19, 68, 224, 263, 274; C. R. Woese et al., unpublished data). Y,
Pyrimidine; R, purine; N, any base; —, no base.
b Position in 16S rRNA sequence.
c Eucaryotic compositions taken from alignment used to construct Fig. 14; upper case, no exceptions; lower case, one exception.
d Upper case, Base found in all or all but one sequence in the kingdom but in no more than one sequence in the other kingdom.
e Lower case, As for footnote d but with two to three exceptions.

At the level of oligonucleotide analysis it was apparent 
that the bacteria separate into more or less naturally defined 
"phyla" (see discussion below). This was not apparent from
the binary association coefficients (S_AB values), but could be
seen in oligonucleotide signatures (266). (Such signatures are
collections of specific oligonucleotides, each of which occurs
in most or all members of a given phylogenetic group but 
rarely, if at all, in most other groups, especially closely 
related ones [266].) Full rRNA sequences now permit the 
identification of individual positions in the molecule, se- 
quence signatures that define the various groupings. We will 
use these rather than the older oligonucleotide signatures in 
the discussion below. 
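
For readers unfamiliar with oligonucleotide cataloging, the sketch below shows how a catalog and a binary association coefficient might be computed. T1 ribonuclease cuts after G residues, so each fragment ends in G. The formula used, S_AB = 2 N_AB / (N_A + N_B), counted over residues in oligonucleotides of six or more bases, is the form commonly given for this coefficient. The text above does not state it, so the formula, the length cutoff, the function names, and the toy sequences should all be read as assumptions.

```python
from collections import Counter


def t1_catalog(rna, min_len=6):
    """Cut after every G and keep fragments of at least `min_len` bases."""
    fragments, current = [], ""
    for base in rna:
        current += base
        if base == "G":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)  # 3'-terminal fragment with no trailing G
    return Counter(f for f in fragments if len(f) >= min_len)


def s_ab(cat_a, cat_b):
    """Binary association coefficient of two catalogs (assumed definition)."""
    n_a = sum(len(oligo) * n for oligo, n in cat_a.items())
    n_b = sum(len(oligo) * n for oligo, n in cat_b.items())
    if n_a + n_b == 0:
        raise ValueError("both catalogs are empty")
    common = cat_a & cat_b  # multiset intersection of shared oligonucleotides
    n_ab = sum(len(oligo) * n for oligo, n in common.items())
    return 2.0 * n_ab / (n_a + n_b)


# Toy sequences, invented for illustration.
cat_a = t1_catalog("AUUUAUCGCCUCUAGAG")
cat_b = t1_catalog("AUUUAUCGACCUUCGG")
print(sorted(cat_a), sorted(cat_b), s_ab(cat_a, cat_b))
```

Because a single base change can remove a whole oligonucleotide from the shared set, a coefficient of this kind falls off quickly with divergence, which is consistent with the remark that phylum-level structure was not apparent from S_AB values alone.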

In some cases taxa defined by rRNA sequences can be
identified phenotypically by common characteristics of the
group. For the majority of such taxa, however, any given
common character will link most, but not all, members of the
group. A few of the rRNA groups are without phenotypic
justification. The spirochetes and relatives (see below) seem 
to be a taxon of the first type; all identified members of the 
genotypically defined unit possess the unique and unusual 
axial fibrils classically associated with these organisms (24). 
The gram-positive phylum is an example of the second type; 
the majority of its species have characteristic gram-positive 
cell walls (183), but a few lineages, such as Heliobacterium 
(255) and the mycoplasmas (176), do not have them. An 
example of a genotypically defined unit for which no con- 
vincing phenotypic justification can be given is the bacte- 
roides-flavobacter phylum (166, 234); see below. In this case 
and others like it, the lack of common phenotypic properties 
could merely reflect the fact that the various subgroups have 
been characterized in entirely different ways and so would 
not necessarily be expected to show common characteris- 
tics. Such groups challenge the microbiologist to discover 
their unifying phenotypic motifs. 

The eubacterial phyla and their subdivisions as they are
now understood are listed in Table 2. Table 3 identifies them
by a sequence signature. This key now includes mainly
positions covered by the oligonucleotide catalogs, but will
ultimately be extended to all positions in the 16S (and 23S)
rRNA.

FIG. 7. Location of the signature positions that distinguish eubacteria from archaebacteria (Table 1) on the 16S rRNA secondary structure.
The underlying secondary structure is that for Escherichia coli (260). The signature positions are indicated by filled circles. Positions whose
composition is highly conserved (i.e., the same in over 90% of sequences and oligonucleotide catalogs) between eubacteria and archaebacteria
are also indicated (as an aid in orientation); all others (in the E. coli sequence) have been replaced by dots. Tick marks show every 10th
position in the (E. coli) sequences, every 50th being numbered.

In the following discussion each of the known eubacterial
phyla (divisions) will be defined genealogically and its phenotype
will be briefly described. As was the case with the
kingdoms themselves, three criteria will be used whenever
possible to define a phylum and its subdivisions: (i) coherence
of the unit by sequence distance analysis (or cluster
analysis of S_AB values [55]); (ii) definition of the unit by
sequence signature; and (iii) characterization of the unit in
terms of higher-order structural features of 16S rRNA.

TABLE 2. Eubacterial phyla and their subdivisions^a

Purple bacteria
  α subdivision
    Purple non-sulfur bacteria, rhizobacteria, agrobacteria, rickettsiae, Nitrobacter
  β subdivision
    Rhodocyclus, (some) Thiobacillus, Alcaligenes, Spirillum, Nitrosovibrio
  γ subdivision
    Enterics, fluorescent pseudomonads, purple sulfur bacteria, Legionella, (some) Beggiatoa
  δ subdivision
    Sulfur and sulfate reducers (Desulfovibrio), myxobacteria, bdellovibrios

Gram-positive eubacteria
  A. High-G+C species
    Actinomyces, Streptomyces, Arthrobacter, Micrococcus, Bifidobacterium
  B. Low-G+C species
    Clostridium, Peptococcus, Bacillus, mycoplasmas
  C. Photosynthetic species
    Heliobacterium
  D. Species with gram-negative walls
    Megasphaera, Sporomusa

Cyanobacteria and chloroplasts
    Aphanocapsa, Oscillatoria, Nostoc, Synechococcus, Gloeobacter, Prochloron

Spirochetes and relatives
  A. Spirochetes
    Spirochaeta, Treponema, Borrelia
  B. Leptospiras
    Leptospira, Leptonema

Green sulfur bacteria
    Chlorobium, Chloroherpeton

Bacteroides, flavobacteria, and relatives
  A. Bacteroides
    Bacteroides, Fusobacterium
  B. Flavobacterium group
    Flavobacterium, Cytophaga, Saprospira, Flexibacter

Planctomyces and relatives
  A. Planctomyces group
    Planctomyces, Pasteuria
  B. Thermophiles
    Isocystis pallida

Chlamydiae
    Chlamydia psittaci, C. trachomatis

Radioresistant micrococci and relatives
  A. Deinococcus group
    Deinococcus radiodurans
  B. Thermophiles
    Thermus aquaticus

Green non-sulfur bacteria and relatives
  A. Chloroflexus group
    Chloroflexus, Herpetosiphon
  B. Thermomicrobium group
    Thermomicrobium roseum

a Showing representative examples. See appropriate sections in text for references.

Purple Bacteria 

What for want of a better name have been called the 
"purple bacteria" contain most, but not all, of the traditional 
gram-negative bacteria (50, 51, 56, 64, 163, 254, 266-269).
However, the arrangement of classically defined families, 
genera, and even species within this phylum is a jumbled 
one. Photosynthetic species group with nonphotosynthetic 
ones; heterotrophs are paired with chemolithotrophs; 
anaerobes are paired with aerobes, etc. Because the purple 
photosynthetic phenotype is distributed more or less 
throughout the group, and because photosynthesis is com- 
plex enough that its having arisen more than once is unlikely, 
the ancestral phenotype of the phylum is undoubtedly (pur- 
ple) photosynthetic, which justifies the name purple bacte- 
ria. Photosynthetic capacity has been lost many times in this 
phylum, resulting in various nonphotosynthetic sublines. 
(The alternative explanation, that photosynthetic capacity 
was genetically transferred among species, is not supported 
by the evidence [258].) 

The purple bacteria fall rather naturally into four subdivi- 
sions, which, awaiting appropriate formal nomenclature, are 
designated α, β, γ, and δ. Figure 8 is a distance matrix tree
for the purple bacteria, based upon five representative
sequences in each subdivision. The corresponding tree given
by maximum parsimony analysis (not shown) agrees with the
branching order shown in Fig. 8, except for some details of
branching in the δ subdivision. Table 4 distinguishes the four
subdivisions by sequence signature. The purple bacteria as a 
whole, however, make up one of the few phyla that cannot 
be defined by a simple signature (see Table 3), although 
parsimony analysis as well as sequence distance treeing 
readily define the group (267). 
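
The distance-matrix step that precedes a tree such as Fig. 8 can be sketched as follows. Only alignment columns present in every sequence are compared, as the legend to Fig. 8 specifies. The Jukes-Cantor correction used here to convert observed differences into distances is a stand-in chosen for simplicity; it is not claimed to be the method of reference 99, and the function names and toy sequences are likewise assumptions.

```python
import math


def shared_columns(seqs):
    """Indices of alignment columns in which no sequence has a gap ('-')."""
    return [i for i in range(len(seqs[0])) if all(s[i] != "-" for s in seqs)]


def distance_matrix(seqs):
    """Pairwise Jukes-Cantor distances over columns shared by all sequences."""
    cols = shared_columns(seqs)
    if not cols:
        raise ValueError("no column is represented in all sequences")
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = sum(seqs[i][c] != seqs[j][c] for c in cols) / len(cols)
            if p == 0:
                continue  # identical over the shared columns
            if p >= 0.75:
                raise ValueError("sequences too divergent for this correction")
            dist[i][j] = dist[j][i] = -0.75 * math.log(1 - 4 * p / 3)
    return dist


# Toy alignment, invented for illustration; column 3 is dropped because of the gap.
toy = ["GAUCGAUC", "GAUCGAUU", "GA-CGAUC"]
for row in distance_matrix(toy):
    print(row)
```

A tree-building method would then be applied to the resulting matrix; that step is omitted here.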

Two helices in 16S rRNA, positions 184 to 193 and 198 to 
219 (E. coli numbers) (260), help to define and distinguish the 
four subdivisions; see Fig. 9. While the structure of these 
helices can vary drastically between subdivisions, within a 
given subdivision each remains constant to a first approxi- 
mation. The first of the two helices, i.e., positions 184 to 193, 
contains only 3 base pairs in α, β, and γ subdivision
sequences, but about 10 base pairs in the δ subdivision
sequences. The short form of the helix is rare. So far it has 
been found outside the purple bacteria in only two other 
phyla, the cyanobacteria (224) and the planctomyces (H. 
Oyaizu, unpublished data). In all other cases, including the 
archaebacteria (72, 96, 124, 126, 158), a much longer version 
of the helix occurs, which therefore is likely to be its 
ancestral form. 

The second helix, positions 198 to 219, contains approximately
8 base pairs in all, except the α-subdivision sequences,
wherein 2 base pairs only make up the stalk (260,
274). However, for this helix the short version is common
among eubacteria and even occurs in archaebacteria, while
the longer version is rare (among eubacteria). The long
version has a common structure in most sequences from the
β and γ subdivisions, but the characteristic δ-subdivision
(long) version differs from this structure in detail (see Fig. 9).

The (near) constancy of structure for each helix within a
given subdivision can be inferred from oligonucleotide catalog
data. For example, 80% of α-purple bacterial catalogs
contain the octanucleotide (G)AUUUAUCG, which entirely
covers the version of the second helix (positions 198 to 219)
found in this subdivision (266, 267). (G)AYCUUCG, which
forms part of this helix in the β and γ subdivisions, is found
in 44% of γ catalogs and 13% of those from the β subdivision
(267-269). (Although all known 16S rRNA sequences from β
species have the structure for this helix shown in Fig. 9, the
loop often contains a G residue, which breaks up the [T1
ribonuclease] oligonucleotide otherwise characteristic of the
structure [unpublished analysis].) Oligonucleotides of the
form (G)CCUCU..., seen in the δ-purple bacterial version
of this helix, occur in 78% of δ-subdivision catalogs (163).

TABLE 3. 16S rRNA sequence signature for the eubacterial phyla and their subdivisions^a

a Abbreviations not explained are obvious from text. All positions are based upon oligonucleotide data (266) except 353 (233).
b Same composition as consensus (•).
c Composition upper case: major base; if no other specified, then it accounts for >90% of assayable cases.
d Composition lower case: minor occurrence base; found in <15% of assayable cases (or in only one species for groups containing seven or less species).
e —, No nucleotide at this position.
f Composition in parentheses: based upon one example only.

With regard to the first helix, positions 184 to 193, oligonucleotides
representing the structure can be identified in
only 35 and 26% of cataloged species, respectively, from the
β and γ subdivisions (unpublished analysis). However, every
one of the sequenced 16S rRNAs from these two
subdivisions shows the characteristic 3-base pair structure
(unpublished analysis). Based upon oligonucleotide and sequence
evidence, at least 83% of α-purple bacteria must also
possess the same form of this helix (unpublished analysis).
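
Percentages of the kind quoted above come from asking how many catalogs in a group contain a given oligonucleotide. The sketch below shows one way to make that count when the oligonucleotide is written with degenerate symbols such as Y (pyrimidine). It assumes that the parenthesized G in a notation like (G)AYCUUCG is the 5' neighbor left by the preceding cut and is not part of the oligonucleotide itself. The function names and the toy catalogs are invented for illustration.

```python
# Degenerate symbols used in the text: Y, pyrimidine; R, purine; N, any base.
CODES = {"A": "A", "C": "C", "G": "G", "U": "U",
         "Y": "CU", "R": "AG", "N": "ACGU"}


def matches(pattern, oligo):
    """True if `oligo` has the same length as `pattern` and fits it base by base."""
    return len(pattern) == len(oligo) and all(
        base in CODES[symbol] for symbol, base in zip(pattern, oligo))


def percent_with(pattern, catalogs):
    """Percentage of catalogs containing at least one oligonucleotide fitting `pattern`."""
    if not catalogs:
        raise ValueError("no catalogs given")
    hits = sum(any(matches(pattern, oligo) for oligo in cat) for cat in catalogs)
    return 100.0 * hits / len(catalogs)


# Toy catalogs, one set of oligonucleotides per species (not real data).
toy_catalogs = [
    {"AUCUUCG", "CCUCUAG"},
    {"ACCUUCG"},
    {"AUUUAUCG"},
    {"CCUCUAAG"},
]
print(percent_with("AYCUUCG", toy_catalogs))
```

A low percentage from such a count does not by itself mean the structure is absent: as noted for the β subdivision, a single G in the loop is enough to split the characteristic oligonucleotide.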



Figure 8 shows the β and γ subdivisions to be relatively
closely related to one another, a fact that can also be
demonstrated with other types of analysis (163); see Table 4.
The common β-γ lineage and the α- and δ-subdivision
lineages appear to have split from one another in such rapid
succession that their branching order cannot be resolved by
the evidence that now exists. However, the higher-order
structural evidence, specifically the (derived) form of the
helix covering positions 184 to 193 (Fig. 9), would suggest
that the α, β, and γ subdivisions form a grouping that
excludes the δ subdivision.

Tables 5 through 8 list representative genera and species in
each of the four subdivisions in a rough phylogenetic arrangement.
Three of the four subdivisions contain photosynthetic
species: the α-purple bacteria seem predominantly
photosynthetic, and the β subdivision shows photosynthesis
in its two main subgroups, while in the γ subdivision
photosynthesis appears to be confined to one of its three
main subgroups.

FIG. 8. Phylogenetic tree for the purple bacteria based upon 16S rRNA sequences. The tree was constructed (36) from an evolutionary
distance matrix (99), generated from an alignment (260) of five representative sequences from each of the four purple bacterial subdivisions.
(Only positions represented in all sequences were used in the calculation.) The root was determined by using several eubacterial outgroup
sequences. The sequences used were as follows. α subdivision: 1, Rhodospirillum rubrum (Woese et al., unpublished data); 2, Agrobacterium
tumefaciens (275); 3, Rhodopseudomonas palustris (Woese et al., unpublished data); 4, Rhodopseudomonas acidophila (Woese et al.,
unpublished data); 5, Rhodobacter capsulatum (Woese et al., unpublished data). β subdivision: 1, Neisseria gonorrhoeae (Woese et al.,
unpublished data); 2, Spirillum volutans (Woese et al., unpublished data); 3, Nitrosolobus multiformis (Woese et al., unpublished data); 4,
Rhodocyclus gelatinosa (Woese et al., unpublished data); 5, Rhodocyclus purpureus (Woese et al., unpublished data). γ subdivision: 1,
Chromatium vinosum (Woese et al., unpublished data); 2, Legionella pneumophila (Woese et al., unpublished data); 3, Pseudomonas
aeruginosa (Woese et al., unpublished data); 4, Acinetobacter calcoaceticus (Woese et al., unpublished data); 5, Escherichia coli (19). δ
subdivision: 1, Myxococcus xanthus (163); 2, Desulfovibrio desulfuricans (163); 3, Bdellovibrio stolpii (Woese et al., unpublished data); 4,
Desulfobacter postgatei (Woese et al., unpublished data); 5, Desulfuromonas acetoxidans (Woese et al., unpublished data). Outgroup
sequences: Bacillus subtilis (68) and Thermotoga maritima (1).

α-Purple bacteria. The intimate juxtaposition of photosynthetic
and nonphotosynthetic species in the α subdivision
suggests a more or less continual evolution of the latter from
the former (267). Aerobic metabolism also appears to have
arisen a number of times in this subdivision alone (267). The
close association of species reducing and oxidizing nitrogen
compounds, e.g., Rhodopseudomonas palustris and Nitrobacter
winogradskyi (267), suggests some sort of evolutionary
connection between the two metabolisms. The metabolic
richness and diversity of species evolving from purple photosynthetic
ancestry in this subdivision are remarkable.

The α subdivision is also of general biological interest
because certain of its members have interesting associations
with various eucaryotes. The rhizobacteria (essential for
nitrogen fixation in legumes), the agrobacteria (pathogenic
for plants) (267, 274), and the rickettsias (intracellular pathogens
of animals) (235) form a tight cluster within subgroup
α-2. Sequence differences in 16S rRNA among members of
this cluster are under 7% (235). These particular species
have in common the tendency to form intimate, if not
intracellular, associations with eucaryotic cells. It is then no
great surprise to learn that the endosymbiont that gave rise
to (most, if not all) eucaryotic mitochondria was itself a
member of the α subdivision (274).

β-Purple bacteria. Most of the characterized β-purple
species fall into two main subgroups. However, poorly
defined deeper branchings are also evident, represented, for
example, by Spirillum and Neisseria (269). This subdivision
is a potpourri of classical genera (Table 6), some of which are
not even phylogenetically coherent within the subdivision
(269). The β-photosynthetic species, recently reclassified in
the genus Rhodocyclus (92), are quite distinct from other
purple nonsulfur bacteria, i.e., those residing in the α
subdivision. Over and above their rRNA sequence differences,
β species differ from their α counterparts in cytochrome
c type; β cytochromes are of the small-subunit type,
while cytochromes from the α subdivision are of the medium
or large type (37). Moreover, photoreaction centers in photosynthetic
β species have a structure distinct from that seen
among α species (27).

γ-Purple bacteria. The γ-purple bacteria (Table 7), the
most extensively characterized of the four subdivisions of
purple bacteria, are again a mixture of phenotypes (268):
photosynthetic with nonphotosynthetic, aerobic with
anaerobic, heterotrophic with chemolithotrophic, etc. Oligonucleotide
catalog analysis divided the γ-purple bacteria into
three main subgroups: one containing mainly photosynthetic
species of the purple sulfur type, e.g., Chromatium (50, 198);
a second known to contain only species associated with
Legionnaires' disease (132); and a third that is a mixture of
nonphotosynthetic genera from the enterics, vibrios,
oceanospirilla, the fluorescent pseudomonads and relatives,
and others (268). As additional complete 16S rRNA sequences
become available, it begins to look as though the β
subdivision may ultimately be shown to be a subgroup within
the γ subdivision (albeit deeply branching). In any case its
close association with the γ subdivision is surprising given
that the β-purple bacteria have such a distinctive signature;
see Table 4 (266).

δ-Purple bacteria. The δ-purple bacteria (Table 8) subdivision
harbors three disparate phenotypes: the sulfur and
sulfate-reducing eubacteria, the myxobacteria and relatives,
and the bdellovibrios (163). At the present writing the
phylogenetic detail within the subdivision is not clear. The
myxobacteria and relatives indeed form a coherent grouping,
a clade (130). The bdellovibrios probably form one as well,

TABLE 4. Sequence signature distinguishing the four subdivisions of purple bacteria^a

a Compositions based upon an alignment of 31 16S rRNA sequences from purple bacteria and oligonucleotide catalogs (19, 51, 64, 83, 163, 198, 267-269, 274;
Woese, unpublished analysis). Y, Pyrimidine; R, purine; N, any nucleotide. —, Position does not exist in species so indicated.
b Catalog information was also used in determining composition.
c Consensus composition for the other eubacterial phyla; no clear consensus is indicated by "N".
d Upper case, Major base; if no other specified, it accounts for >90% of assayable cases.
e Lower case, Minor occurrence base; found in <15% of assayable cases (or in only one sequence in group).



FIG. 9. Differences in higher-order structural detail among the various subdivisions of the purple bacteria for the region of 16S rRNA
between positions 180 and 220 (163). Composition of a position is given when it is invariant or highly conserved within a subdivision, but it
is shown as a dot otherwise. Base pairs are indicated by connecting lines.

TABLE 5. Characterized species of α-purple bacteria (64, 269)^a

Subgroup α-1
Rhodospirillum rubrum

R. photometricum
R. molischianum
R. fulvum

Rhodopseudomonas globiformis
Aquaspirillum itersonii
Azospirillum brasilense

Subgroup α-2
Rhodomicrobium vannielii
Rhodopseudomonas viridis
Rhodopseudomonas palustris

Nitrobacter winogradskyi
Rhizobium leguminosarum

Agrobacterium tumefaciens

Rochalimaea quintana
Rhodopseudomonas acidophila
Pseudomonas diminuta
Phenylobacterium immobile

Subgroup α-3
Rhodobacter capsulatum
Rhodobacter sphaeroides
Paracoccus denitrificans
Manganese oxidizers (2 strains)

Subgroup α-4
Erythrobacter longus

a Indentation indicates specific relationship; for example, Rhizobium leguminosarum,
Agrobacterium tumefaciens, and Rochalimaea quintana are
specific relatives of one another, to the exclusion of the other species in their
subgroup.



TABLE 6. Characterized species of β-purple bacteria (64, 269)^a

Subgroup β-1
Rhodocyclus gelatinosa

Sphaerotilus natans
Pseudomonas testosteroni

P. acidovorans

Aquaspirillum gracile

A. aquaticum

Comamonas terrigena
Thiobacillus intermedius

Subgroup β-2
Rhodocyclus tenue

R. purpureus
Aquaspirillum dispar
A. serpens

A. bengal
Chromobacterium violaceum
Chromobacterium lividum
Alcaligenes faecalis
Alcaligenes eutrophus
Pseudomonas cepacia
Thiobacillus denitrificans
Vitreoscilla stercoraria

Subgroup β-3^b
Spirillum volutans
Nitrosomonas europaea
Nitrosococcus mobile
Nitrosolobus multiformis

Nitrosovibrio tenuis

Nitrosospira sp.
Neisseria gonorrhoeae

a Indentation indicates specific relationship.

b Paraphyletic group, defined only by exclusion from subgroups 1 and 2.

TABLE 7. Characterized genera or species of γ-purple bacteria
(50, 64, 132, 268)^a

Subgroup γ-1
Chromatium

Thiocapsa

Thiocystis

Thiodictyon

Thiospirillum

Lamprocystis
Nitrosococcus oceanus
Ectothiorhodospira

Subgroup γ-2
Legionella

Subgroup γ-3
Fluorescent pseudomonads^b

Xanthomonas
Lysobacter

Acinetobacter
Oceanospirillum
Alcaligenes putrifasciens
Pasteurella multocida
Aeromonas hydrophila
"Bacteroides" amylophilus
Enterics, vibrios, and photobacteria
Halomonas elongata

"Flavobacterium" halmephilum
Leucothrix mucor
Beggiatoa leptomitiformis

a Indentation indicates specific relationship.

b Includes Pseudomonas aeruginosa, P. alcaligenes, P. fluorescens, P.
putida, P. stutzeri, P. syringae, P. pseudoalcaligenes, and Serpens flexibilis.



TABLE 8. Characterized genera and species of δ-purple bacteria
(51, 83, 130, 163)^a

Myxococcus group
Myxococcus

Cystobacter fuscus

Stigmatella aurantiaca
Sorangium cellulosum
Nannocystis exedens

Bdellovibrio group
Bdellovibrio stolpii

Bdellovibrio starrii
Bdellovibrio bacteriovorus

Sulfur and sulfate reducers
Desulfovibrio desulfuricans
Desulfuromonas
Desulfobacter postgatei

Desulfosarcina variabilis

Desulfonema limicola
Desulfobulbus propionicus

a Indentation indicates specific relationship.

but the fact that Bdellovibrio bacteriovorus is relatively 
rapidly evolving makes its exact placement uncertain (83). 
Whether the myxobacteria and relatives or the bdellovibrios 
(or both) arise outside of the group defined by the sulfur and 
sulfate reducers remains unresolved (unpublished analysis). 
In any case it would appear that the myxobacteria and 
bdellovibrios represent aerobic adaptations of some ances- 
tral anaerobic sulfur-metabolizing phenotype. 

Gram-Positive Eubacteria 

Cell wall type distinguishes the gram-positive eubacteria 
from the others (56, 183). However, as mentioned above, 



Vol. 51, 1987 



BACTERIAL EVOLUTION 



241 



TABLE 9. Characterized genera and species of gram-positive bacteria* 



High-G+C subdivision 
Bifidobacterium 
Propionibacterium 
Actinomyces 
Arthrobacter 
Micrococcus 
Dermatophilus 
Ceiluiomonas 
Derskovia 
Nocardia celluians 
Microbacterium 

Corynebactehum (plant associated) 
Brevibacterium linens 
Streptomyces 
Kitasatoa 
Chainia 

Microellobosporia 

Streptoverticillium 
Actinomadura 
Streptosporangium 
Thermomonospora 
Mycobacterium 

Nocardia 

Brevibacterium ketoglutamicum 
Corynebacterium 
Geodermatophilus 
Frankia 

Dactylosporangium 
Ampullariella 
Actinoplanes 
Micromonospora 
Arthrobacter simplex 

Photosynthetic subdivision 
Heliobacterium 

Species with gram-negative walls 
Megasphaera 
Selenomonas 
Sporomusa 



Low-G+C subdivision 
Bacillus 

Planococcus 
Sporolactobacillus 
Sporosarcina 
Thermoactinomyces 
Staphylococcus 
Lactobacillus 
Pediococcus 
Leuconostoc 
Streptococcus 
Mycoplasma 
Acholeplasma 
Spiroplasma 
Anaeroplasma 
Clostridium innocuum 
Erysipelothrix 
Clostridium pasteurianum 
C. butyricum 
C. scatologenes 
Sarcina ventriculi 
Clostridium oroticum 
C. indolis 
C. aminovalericum 
Butyrivibrio fibrosolvens 
Clostridium lituseburense 
Eubacterium tenue 
Peptostreptococcus 
C. aceticum 
C. acidiurici 
C. purinolyticum 
Clostridium barkeri 

Eubacterium limosum 
Acetobacterium woodii 
Clostridium thermosaccharolyticum 
C. thermoaceticum 
Acetogenium 

Thermoanaerobium 
Peptococcus 
Ruminococcus 



* Approximate phylogenetic clustering suggested by indentation. References 56, 131, 199-201, 203-207, 220, 221, 255, 262, 264. 



there are several important exceptions whose walls are not 
gram positive. The phylum appears to consist of four subdi- 
visions, only two of which are well characterized. These two 
are readily distinguished on the basis of DNA composition. 
The one includes species whose DNAs all contain more than 
55% guanine plus cytosine (G+C); the other is made up of 
species whose DNAs contain <50% (56, 201, 204). The 
recently discovered phototroph Heliobacterium chlorum 
(58, 60, 255) is the only characterized representative of the 
third subdivision, while the genera Megasphaera, Seleno- 
monas, and Sporomusa constitute the fourth (203). Members 
of the third and fourth subdivisions do not have gram- 
positive cell walls (60, 203). 
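
The G+C criterion above can be illustrated with a short Python sketch. It computes the percentage of guanine plus cytosine in a DNA string and applies the >55% and <50% thresholds quoted in the text. The function names are hypothetical, and the sketch assumes the input stands for genomic DNA composition, which is what the thresholds refer to.

```python
def gc_percent(dna: str) -> float:
    """Return the G+C content of a DNA string as a percentage."""
    # Count only unambiguous bases; anything else (N, gaps) is ignored.
    bases = [b for b in dna.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("no A/C/G/T bases found")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def gram_positive_subdivision(gc: float) -> str:
    """Hypothetical helper applying the thresholds quoted in the text."""
    if gc > 55.0:
        return "high-G+C"
    if gc < 50.0:
        return "low-G+C"
    # The text leaves the 50-55% range unassigned.
    return "undetermined"
```

DNA composition alone would not place members of the third and fourth subdivisions, so a result from this helper is at best a first sorting step.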

Species in the high-G+C gram-positive subdivision con- 
form to a general actinomycete phenotype: they tend to be 
pleomorphic, form branched filaments, etc. (200, 201, 
204-206). Most are aerobic, with the exception of the deeper 
branches, e.g., the bifidobacteria. The group as a whole is 
not particularly deep; by oligonucleotide catalog measure all 
high-G+C gram-positive bacteria are no further from each 
other than are Bacillus species from those of Lactobacillus, 
for example. The lowest S_AB values in the group correspond 
roughly to sequence similarities in the range of 85% (56, 201, 
204). The group, therefore, would seem not to be a particu- 
larly ancient one. 

Species in the low-G+C gram-positive subdivision con- 
form by and large to a clostridial phenotype. They tend to be 



anaerobic, rod shaped, endospore forming, etc., although a 
number have lost one or more of these characteristics. In 
contrast to the high-G+C group, the low-G+C gram-positive 
bacteria form a phylogenetically deep, and therefore pre- 
sumably ancient, cluster (56). 

Table 9 lists some of the characterized gram-positive 
genera and species in a crude phylogenetic ordering. 

The gram-positive rRNAs sequenced so far are not 
broadly representative enough to permit the construction of 
a phylogenetic tree for the entire phylum. However, the 
phylum is easily defined by cluster analysis of S_AB values 
based upon oligonucleotides (56, 206). Although few in 
number, the signature positions characterizing the gram- 
positive eubacteria (Table 3) are strong ones. Note in par- 
ticular the presence of a C residue at position 1207, which 
occurs in all cataloged gram-positive bacteria (roughly 150 
species) and in cyanobacteria, but nowhere else among 
eubacteria (266). The A residue at position 1198 is present in 
roughly 75% of gram-positive species, but occurs elsewhere 
only in the bacteroides phylum and occasionally among the 
spirochetes (266). An absolute requirement for an A residue 
at position 513 also characterizes the gram-positive bacteria, 
but few other phyla. 
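
A signature check of this kind can be sketched in Python. The example assumes the bases of a sequence have already been mapped to the standard position numbering used in the text; that mapping requires an alignment and is not shown. The dictionary and function names are hypothetical, and only the three positions named in the paragraph above are included.

```python
# Positions and bases taken from the paragraph above (gram-positive signature).
GRAM_POSITIVE_SIGNATURE = {513: "A", 1198: "A", 1207: "C"}


def signature_matches(bases_by_position, signature=GRAM_POSITIVE_SIGNATURE):
    """Compare observed bases with a signature.

    bases_by_position maps a position number to the base observed there.
    Returns, for each signature position, True or False for a match or
    mismatch, or None when the position was not observed.
    """
    result = {}
    for position, expected in signature.items():
        observed = bases_by_position.get(position)
        if observed is None:
            result[position] = None
        else:
            result[position] = observed.upper() == expected
    return result
```

Because the A at position 1198 is reported in only about 75% of gram-positive species, a mismatch there does not by itself exclude membership in the phylum.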

Higher-order structure in 16S rRNA also helps to identify 
gram-positive bacteria. Two characteristic adjacent A-G 
pairs, positions 1425-6 to 1474-5, occur in the penultimate 
helix of almost all gram-positive 16S rRNAs sequenced to 

TABLE 10. Sequence signature distinguishing gram-positive subdivisions^a 
^a References: low-G+C subdivision (131, 199, 203, 220, 221); high-G+C subdivision (200, 201, 204, 205, 207); sporomusa group (203); Heliobacterium chlorum 
(255). 

^b Position in sequence. 

^c Composition of position (percentage of oligonucleotide catalogs having an oligonucleotide covering the position which shows the indicated composition); 
failure to find an oligonucleotide covering a position in some catalogs causes percentage compositions to sum to less than 100%. 
^d Composition characteristic of position in the (vast) majority of other eubacterial phyla. 



date (260; unpublished data); see Fig. 10. This arrangement 
has yet to be seen outside this group. (We will encounter 
G-A pairs at this location in another phylum, however.) 
Oligonucleotide catalog data indicate that these adjacent 
A-G pairs are common among gram-positive bacteria. The 
general sequence (G)YAAYACCC (which includes the two 
adjacent A's of the A-G pairs in question) is found in 85% of 
catalogs from the high-G+C group and in 67% of those from 
the low-G+C group, but occurs nowhere else among 
eubacteria (266). 
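
The general sequence (G)YAAYACCC can be searched for with a regular expression, as in the Python sketch below. It assumes Y is the standard ambiguity code for a pyrimidine (C or U) and treats the parenthesized G as the residue immediately preceding the oligonucleotide; that reading of the notation is an assumption. The function name is hypothetical, and T is converted to U so that DNA-style input also works.

```python
import re

# Y = pyrimidine (C or U); the motif itself is YAAYACCC.
_MOTIF = re.compile(r"[CU]AA[CU]ACCC")


def find_yaayaccc(seq):
    """Return (start, preceded_by_g) for each YAAYACCC occurrence.

    start is a 0-based index into the sequence; preceded_by_g reports
    whether the residue just before the motif is a G.
    """
    rna = seq.upper().replace("T", "U")
    hits = []
    for match in _MOTIF.finditer(rna):
        start = match.start()
        preceded_by_g = start > 0 and rna[start - 1] == "G"
        hits.append((start, preceded_by_g))
    return hits
```

For example, `find_yaayaccc("GGCAAUACCCG")` reports one hit starting at index 2 with a preceding G.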

The sequence signature of Table 10 distinguishes between 
the two main gram-positive subdivisions. In all cases the 
composition characteristic of the low-G+C subdivision is 
that found in most other phyla; it is undoubtedly, therefore, 
the ancestral composition. This would suggest that the 
high-G+C gram-positive lineage has been subject to rapid 
evolution; see discussion below. 

Genealogical substructure within the two major subdivi- 
sions can be seen (56, 200, 201, 204-206), although it will not 
be systematically detailed here. In the high-G+C subdivi- 
sion, the deepest branchings are defined by anaerobic spe- 
cies, the bifidobacteria and the propionibacteria. The low- 
G+C subdivision shows at least five major branches, most of 
which contain clostridia. 



One subline in the low-G+C subdivision has given rise to 
four groups of particular interest: Bacillus, Lactobacillus, 
Streptococcus, and the mycoplasmas and their relatives (56, 
262). The subline is an interesting evolutionary study, for its 
evolution in a way parallels the development of aerobic 
conditions on the planet. Bacillus species are basically 
aerobic, though a few also grow well anaerobically. Lacto- 
bacillus, Streptococcus, and the mycoplasmas are basically 
anaerobic, but tolerate and in some cases even utilize a little 
oxygen. Their clostridial relatives are true anaerobes. The 
evolutionary radiation that spawned the various groups 
might well have occurred during the microaerobic period in 
earth history (see below), with Bacillus then becoming fully 
adapted to an oxygen atmosphere. (The exact order of 
branching among these groups will not be known until 
sequences representing all four have been determined.) The 
mycoplasma group will be considered in detail below. 

Cyanobacteria 

The cyanobacteria, the classical blue-green algae, are a 
group of procaryotes defined by the common possession of 
chlorophyll a. They form a phylogenetically coherent unit 
(12, 40, 56) that contains no known nonphotosynthetic 

FIG. 10. Higher-order structure for the gram-positive bacteria in 
the region of 16S rRNA between positions 1410 and 1480 (68, 93, 
163). Composition of a position is given when it is invariant or highly 
conserved within the gram-positive bacteria, but it is shown as a dot 
otherwise. Base pairs are indicated by connecting lines; A-G pairs 
are shown by open circles. Arrow shows adjacent A-G pairs 
discussed in text. 



representatives. However, the phylum does include Pro- 
chloron (191), an organism that, like green plant chloro- 
plasts, possesses chlorophyll b as well as chlorophyll a (127). 
Green algal chloroplasts trace their ancestry to this phylum, 
as expected (11, 41, 67, 189, 223). Analysis of oligonucleo- 
tide catalog data did not resolve the questions of whether the 
chloroplasts (whose rRNAs appear somewhat rapidly evolv- 
ing) arose from within or from just outside of the cluster 
defined by extant cyanobacteria and whether these organel- 
les are related to Prochloron to the exclusion of the 
cyanobacteria. Although about 25 complete sequences of 
cyanobacterial 16S rRNAs have now been determined (but 
as yet not that from Prochloron), these questions remain 
unresolved as of this writing (S. J. Giovannoni, S. Turner, 
G. J. Olsen, D. J. Lane, and N. R. Pace, manuscript in 
preparation). The precise origin of the brown algal 
chloroplasts and certain others also has yet to be deter- 
mined. However, the rRNAs of the chloroplasts from the red 
algae (10) and Euglena sp. (276) are known to be closely 
related to those of cyanobacteria. 

The sequence signature for the cyanobacteria and 
chloroplasts (Table 3) is small but significant. Position 799, a 
G residue in all but 1 to 2% of other eubacterial catalogs, is 
U or A in all cyanobacterial sequences or catalogs, including 
Prochloron (191); G's are encountered in some chloroplast 
sequences, however (276). Position 1207, a G residue in all 
other eubacterial catalogs, is always C in the gram-positive 
bacteria and cyanobacteria (including Prochloron and the 
chloroplasts), except for one cyanobacterial sequence (191; 
Giovannoni et al., in preparation). Very few eubacteria show 
A at position 1233 (266); however, all cyanobacteria (except 
one) and Prochloron do (191; Giovannoni et al., in prepara- 



tion). (Most chloroplast examples do not [10, 41, 67, 189, 
223, 276].) 

Green Sulfur Bacteria 

The four cataloged species of the green photosynthetic 
bacteria covering the genera Chlorobium and Chloroher- 
peton form a relatively tight phylogenetic unit, especially in 
view of the fact that their phenotype is generally considered 
a very primitive one (63, 266). Since the characterized 
species seem representative of the known spectrum of green 
sulfur bacteria, the question of why the group is relatively 
shallow becomes nontrivial. As will be seen below, the green 
sulfur bacteria are not related to the other green type of 
photosynthetic bacteria, the so-called green non-sulfur bac- 
teria (63). 

The signature characteristic of the phylum (Table 3) con- 
tains positions 995, 1234, and 1410. The one sequence now 
available for the group (W. G. Weisburg, Ph.D. thesis, 
University of Illinois, Urbana, 1986) suggests that additional 
strong signature positions will appear when more sequences 
from the group are known, e.g., the lack of a base in the 
vicinity of position 1167 and the insertion of a base after 
position 1174 (Weisburg, Ph.D. thesis). 

Spirochetes 

Spirochetes are one of the few groupings correctly identi- 
fied by classical (morphological) criteria (167). Their com- 
mon spiral shape and axially coiled fibrils, lying between 
inner and outer cell envelopes, are strikingly characteristic 
(24). Table 11 shows representative species whose rRNAs 
have been characterized, in rough phylogenetic arrange- 
ment. The sequence signature for the group shown in Table 
3 is quite distinctive. For example, the U residue at position 
47 found in all species from this group occurs nowhere else 
among eubacteria (266; unpublished analysis). The same 
holds for the A residue at position 52, and the C residue at 
position 1415, while universal among spirochetes, is other- 
wise extremely rare among eubacteria (266). 

Two clearly separated subdivisions exist within the phy- 
lum: one composed of the leptospiras and the other contain- 
ing spirochetes, treponemes, and the like (167, 266). The 



TABLE 11. Characterized species of spirochetes and relatives* 

Spirochete subdivision 
Spirochaeta halophila 
S. aurantia 
S. litoralis 
S. isovalerica 
Treponema succinifaciens 
T. bryantii 
T. denticola 

T. phagedenis 
S. stenostrepta 
S. zuelzerae 
T. pallidum 
Borrelia hermsii 
T. hyodysenteriae 

Leptospira subdivision 
Leptospira patoc 

Leptospira interrogans 
Leptonema illini 

* Reference 167; Oyaizu, unpublished data; Weisburg et al., unpublished 
data. Indentation indicates specific relationship. 

TABLE 12. Characterized species in the bacteroides- 
flavobacterium phylum (166, 234)* 

Bacteroides subdivision 
Bacteroides fragilis 

B. ovatus 

B. uniformis 

B. asaccharolyticus 

B. vulgatus 
B. ruminicola 

B. melaninogenicus 
B. distasonis 

Flavobacterium subdivision 
Flavobacterium aquatile 

Cytophaga johnsonae 
Sporocytophaga myxococcoides 

Cytophaga lytica 

F. uliginosum 

F. breve 
Flavobacterium heparinum 

F. ferrugineum 

F. elegans 

Saprospira grandis 

Haliscomenobacter hydrossis 

Unnamed subdivision 
Unnamed anaerobic flexible rod; strain Pl-12fs 

* Indentation indicates specific relationship. 



classical taxonomic distinction between spirochetes and 
treponemes, however, does not hold up (167); the two types 
form a genealogically intermixed unit. The lone species 
Treponema hyodysenteriae, however, represents a lineage 
distinct from the main spirochete-treponeme cluster (167), 
and the genus Borrelia is slightly peripheral to the main 
cluster as well (167). 

It was previously suggested that the Haloanaerobiaceae, 
unusual anaerobic halophilic eubacteria, belong to this phy- 
lum (161), but full sequence information fails to confirm this 
(A. Oren and C. R. Woese, unpublished). 

Bacteroides, Flavobacteria, and Their Relatives 

The phylum made up of bacteroides and flavobacteria is an 
unexpected mixture of anaerobes, the bacteroides, and 
various aerobes, from genera such as Flavobacterium, Cyto- 
phaga, and others (166, 234). Table 12 lists some of its 
characterized representatives. 

The grouping is cleanly defined by both sequence distance 
and parsimony analysis of rRNA sequences (234). Table 3 
shows its quite distinctive signature. Note, for example, the 
U residue at position 570 (found in all members of this and 
the planctomyces phylum, but nowhere else among the 
eubacteria), the A residue at position 995 (otherwise a C, 
except in the green sulfur bacteria), and the A at position 
1532 (which sets this group apart not only from all other 
eubacteria, but from archaebacteria and eucaryotes as well 
[5, 166, 259]). Positions 570 and 866 are involved in a 
recognized "tertiary structural" interaction and so vary in 
concert (74). 

A higher-order structural feature so far unique to this 
group is a series of three contiguous G-A pairs involving 
positions 1424-6 with 1474-6 (234). (Recall the two contigu- 
ous A-G pairs in this region characteristic of the gram- 
positive phylum; Fig. 10.) Common oligonucleotide se- 
quences indicate that most if not all organisms in this phylum 
share the property (166). 



The phylum's two major subdivisions separate the 
anaerobic Bacteroides species from the aerobic ones (166). 
The lack of phenotypic resemblance between members of 
the two subdivisions is remarkable, but may reflect only the 
fact that the bacteroides have been studied one way and the 
flavobacteria and relatives another. Most Bacteroides and 
(at least) a few species of Flavobacterium possess sphingo- 
lipids, compounds otherwise rare among eubacteria (234). 
An organism (yet unnamed) whose phenotype seems inter- 
mediate between these two has been isolated by K. O. 
Stetter (166); it is a strictly anaerobic flexible rod. Its 
phylogenetic position is also "intermediate" (166), and so 
the organism probably represents a third uncharacterized 
subdivision. 

A sequence signature (derived from oligonucleotide cata- 
logs) distinguishing the two subdivisions of the phylum is 
shown in Table 13. In most if not all cases the flavobacterial 
version would appear to be the ancestral one for the group, 
in that it is the one found in most of the other eubacterial 
phyla (166). 



Planctomyces and Their Relatives 

Species variously placed in the genera Planctomyces, 
Pasteuria, and Pirella (186, 202), together with the recently 



TABLE 13. Sequence signature distinguishing the bacteroides- 
flavobacterium subdivisions* 
* References 166 and 234. See footnotes b to d in Table 10. B, Bacteroides 
subdivision; F, flavobacterium subdivision. 

described hot spring organism Isocystis pallida (S. J. 
Giovannoni, in J. G. Holt, ed., Bergey's Manual of System- 
atic Bacteriology, vol. 3, in press; S. J. Giovannoni and H. 
Oyaizu, unpublished data), define this phylum. All are noted 
for the fact that their cell walls contain no peptidoglycan 
(110). 

In terms of oligonucleotide catalogs, this is the most 
unique of eubacterial groups. Their S_AB values with other 
eubacteria are generally in the range of 0.10 to 0.15 (202), far 
lower than typical S_AB values between eubacterial phyla, 
which are normally 0.20 to 0.25 (Fig. 2). Oligonucleotide 
catalogs for the species in this group contain fewer of the 
highly conserved (ancestral) oligonucleotides than do those 
from any other phylum (202). As might then be expected, the 
group possesses a strong signature (Table 3). 
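
The S_AB values quoted here are association coefficients between oligonucleotide catalogs. The Python sketch below assumes the usual definition from the cataloging literature, S_AB = 2N_AB/(N_A + N_B), where N_A and N_B count the residues in oligonucleotides of at least six bases in each catalog and N_AB counts the residues in oligonucleotides common to both. That formula, the six-base cutoff, and the function name `s_ab` are assumptions, since this section does not restate the definition.

```python
from collections import Counter


def s_ab(catalog_a, catalog_b, min_len=6):
    """Association coefficient between two oligonucleotide catalogs.

    Each catalog is an iterable of oligonucleotide strings. Oligomers
    shorter than min_len are ignored (assumed cutoff).
    """
    a = Counter(o for o in catalog_a if len(o) >= min_len)
    b = Counter(o for o in catalog_b if len(o) >= min_len)
    n_a = sum(len(o) * count for o, count in a.items())
    n_b = sum(len(o) * count for o, count in b.items())
    if n_a + n_b == 0:
        raise ValueError("no oligonucleotides of sufficient length")
    # Multiset intersection keeps the smaller count of each shared oligomer.
    shared = a & b
    n_ab = sum(len(o) * count for o, count in shared.items())
    return 2.0 * n_ab / (n_a + n_b)
```

Under this definition identical catalogs give 1.0 and catalogs with no oligomer in common give 0.0, so values of 0.10 to 0.15 indicate very little shared oligonucleotide content.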

The remarkable distance of the planctomyces group from 
other eubacteria measured by the rRNA cataloging approach 
was initially interpreted to mean that these organisms repre- 
sent by far the deepest branching in the eubacterial line of 
descent (202). Analysis of full sequences, however, does not 
bear this out (H. Oyaizu and C. R. Woese, unpublished 
data); these large sequence distances are due to rapid 
evolution of the lineage, not an especially early divergence 
from the common line of eubacterial descent; see Fig. 11. 

Chlamydiae 

The two known 16S rRNA sequences representing Chla- 
mydia, i.e., from Chlamydia psittaci and C. trachomatis, are 
closely related; they show <5% difference (233). Since no 
other even moderately close relatives of these organisms are 
known, the phylum cannot yet be considered adequately 
described. A distant relationship between the chlamydiae 
and the planctomyces group is suggested by sequence sig- 
nature (233). Of the sequence positions in Table 3 character- 
istic of the planctomyces and their relatives, five, i.e., 47, 48, 
52, 53, and 353, are also found in the C. psittaci and C. 
trachomatis sequences (233). (Other positions in the 16S 
rRNA sequence, not shown in Table 3, that link the chla- 
mydiae to the planctomyces and relatives are 110, 331, and 
361 [233].) 

Nevertheless, the chlamydiae should be considered to 
represent a distinct phylum, for the similarity between their 
16S rRNAs and those of the planctomyces is too slight to 
place the two in the same taxon. Their sequence similarity, 
71 to 72%, is considerably less than the 78% between 
Planctomyces staleyi and its relative Isocystis pallida (un- 
published calculation). (The 71 to 72% figure is not the 
artificial result of a relatively rapid evolution in the chlamyd- 
ial lineage, for chlamydial sequences are closer to outgroup 
sequences than are their counterparts in the planctomyces 
phylum. In addition, their S_AB values with other eubacteria 
are not as abnormally low as those of the planctomyces [202; 
unpublished calculation].) Interestingly, the chlamydiae, like 
the planctomyces group, also have cell walls that lack 
peptidoglycan (7, 59, 110). 

Radiation-Resistant Micrococci and Their Relatives 

Until recently, the phylum of radiation-resistant micrococci and 
their relatives was known to include only a few closely related 
species of radioresistant bacteria, i.e., Deinococcus radio- 
durans and its relatives (18, 56). However, it has now been 
shown to include the ubiquitous hot spring organism 
Thermus aquaticus as well (82). The signature shown for the 
phylum in Table 3 is rather weak. However, the two se- 



quences now available (Weisburg, Ph.D. thesis; Giovannoni, 
unpublished data) suggest that the group should have a 
strong signature once this can be derived from full sequences 
rather than oligonucleotides. Similarity between the 16S 
rRNA sequences of D. radiodurans and T. aquaticus is 81% 
(unpublished analysis), low enough to place them in separate 
subdivisions of the same phylum. 

Green Non-Sulfur Bacteria and Their Relatives 

The phylum containing the green non-sulfur bacteria is 
one of those for which little phenotypic justification exists. 
The group contains four known members, the thermophilic 
phototroph Chloroflexus aurantiacus, two mesophilic gliding 
species from the genus Herpetosiphon, and the thermophile 
Thermomicrobium roseum (63, 94, 162, 170). Chloroflexus 
and the green sulfur bacteria resemble one another in chloro- 
some structure and light-harvesting pigment type (63, 136); 
yet their rRNAs are unrelated (as mentioned above), and the 
structure of their photoreaction centers differs (45, 171). The 
unusual long-chain diols found in Thermomicrobium, func- 
tionally the equivalent of normal glycerol lipids (172), have 
recently been detected in Chloroflexus as well (T. A. 
Langworthy, personal communication), suggesting that a 
convincing phenotypic rationale for the grouping will ulti- 
mately be found. 

Table 3 shows the group to have a fairly distinctive 
signature. The phylum is also characterized by higher-order 
16S rRNA structural idiosyncrasies (162). For example, the 
helical element between positions 1126 and 1146, a structure 
found in all other eubacterial sequences, is absent in the 
members of this group (162). 

Although too few species have been characterized to 
project meaningful subdivisions, it would seem that Thermo- 
microbium represents one such and Chloroflexus and the 
Herpetosiphon species represent another (162). (Sequence 
homology between Thermomicrobium and the other species 
is a relatively low 77%.) 

Other Eubacterial Phyla 

The 10 phyla described above account for almost all of the 
eubacterial species whose 16S rRNAs have been cataloged or 
sequenced. Since the characterized strains are broadly rep- 
resentative of the known eubacteria, it might seem that few 
additional eubacterial phyla, if any, will be encountered. 
However, isolated rRNA sequences from several unusual 
eubacteria suggest that such is not the case, that many 
eubacterial phyla remain to be described, representing spe- 
cies yet to be discovered. 

The two small clouds on this horizon, the two eubacteria 
whose 16S rRNA sequences do not belong to any of the 
above 10 phyla, are thermophiles noted for their unusual 
lipids. One is Thermodesulfotobacterium commune (unpub- 
lished catalog), a eubacterium having ether-linked lipids 
(119), while the other, Thermotoga maritima, has unique 
lipids that have so far defied complete characterization (89). 
The 16S rRNA sequence from T. commune, which is incom- 
plete, will not be treated in this review; that of Thermotoga 
maritima (1), however, will play a key role in the subsequent 
discussion. 

Overall Structure of the Eubacterial Tree 

The definition of the eubacterial phyla brings us to the 
limit of the rRNA cataloging approach. Unfortunately, this is 

FIG. 11. Eubacterial phylogenetic tree based upon 16S rRNA sequence comparisons. An alignment was constructed (260) from one 
representative sequence from each of the eubacterial phyla together with an archaebacterial consensus sequence (263). Using (only) those 
positions represented in all sequences in the alignment, a (corrected) evolutionary distance matrix was generated (99), from which a distance 
tree was constructed (36). Branch lengths on the tree are proportional to calculated distances. The sequences used are: Thermotoga maritima 
(1); green non-sulfur bacteria, Thermomicrobium roseum (162); deinococci and relatives, Deinococcus radiodurans (Weisburg, Ph.D. thesis); 
spirochetes, Leptonema illini (Oyaizu, unpublished data); green sulfur bacteria, Chlorobium vibrioforme (Weisburg, Ph.D. thesis); 
bacteroides-flavobacteria, Flavobacterium heparinum (234); planctomyces and relatives, Planctomyces staleyi (Oyaizu, unpublished data); 
chlamydiae, Chlamydia psittaci (233); gram-positive bacteria, Bacillus subtilis (68); cyanobacteria, Anacystis nidulans (224); purple bacteria, 
Desulfovibrio desulfuricans (163). For those phyla in which additional 16S rRNA sequences are available, the known sequence depth of the 
group has been (separately) calculated and is indicated by the shaded wedges. Additional sequences added to the alignment when calculating 
these depths are as follows: green non-sulfur bacteria, Chloroflexus aurantiacus (162); bacteroides-flavobacteria, Bacteroides fragilis (234); 
gram-positive bacteria, Heliobacterium chlorum (255); purple bacteria, Agrobacterium tumefaciens (274); and Escherichia coli (19). 



the point at which the study of bacterial evolution starts to 
become interesting, for what we really want to know is the 
phylogenetic relationships among the phyla. With the aid of 
full 16S rRNA sequencing, it has become possible to resolve 
some of these branching orders, and so we can now begin to 
see the progression of eubacterial evolution. 

Figure 11 shows the full eubacterial tree as it is presently 
known. This is a distance matrix tree (36), whose root has 
been subsequently determined by using an archaebacterial 
consensus sequence as an outgroup (1). We shall discuss this 
tree and its implications in detail below, but two important 
points should be noted here. First, nine of the ten phyla 
described above, i.e., all except the green non-sulfur bacte- 
ria and relatives, appear to stem from roughly the same small 
region of the tree. Second, the present root of the tree 
separates all ten of the recognized phyla from the single 
species Thermotoga maritima (1). The microbiologist's at- 



tention to date seems to have been confined to what 
phylogenetically is a nonrepresentative sampling of the 
eubacteria: Thermotoga would appear to represent a vast 
unexplored "other world" of eubacteria, thermophilic orga- 
nisms that (as their unique lipids suggest) almost certainly 
possess a variety of unusual and interesting biochemical 
properties and other characteristics. 
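
A distance matrix of the kind underlying such a tree can be sketched in Python. The example keeps only alignment columns present in every sequence, as described for Fig. 11, and converts the fraction of differing positions into a corrected distance. The specific correction used for the figure is not stated in this section; the Jukes-Cantor formula, d = -(3/4) ln(1 - 4p/3), is assumed here purely for illustration, and the function names are hypothetical.

```python
import math


def shared_columns(seqs, gap="-"):
    """Indices of alignment columns with no gap in any sequence."""
    return [i for i in range(len(seqs[0])) if all(s[i] != gap for s in seqs)]


def jukes_cantor_matrix(seqs, gap="-"):
    """Pairwise corrected distances for equal-length aligned sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    cols = shared_columns(seqs, gap)
    if not cols:
        raise ValueError("no column is represented in all sequences")
    n = len(seqs)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            diffs = sum(1 for c in cols if seqs[i][c] != seqs[j][c])
            p = diffs / len(cols)  # observed fraction of differences
            # The correction is undefined once p reaches 3/4 (saturation).
            d = math.inf if p >= 0.75 else -0.75 * math.log(1.0 - 4.0 * p / 3.0)
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

A tree-building step would then be run on the resulting matrix; that step is not shown here.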

It is important to stress that the root of the eubacterial tree 
shown in Fig. 11 is not an artifact of the treeing procedure. 
The same root placement results from distance-matrix and 
parsimony analyses. (Parsimony analysis on alignments of 
four sequences that comprise an outgroup sequence, the 
Thermotoga sequence, and all possible combinations of two 
other eubacterial sequences, show T. maritima in every case 
to cluster with the outgroup sequence, to the exclusion of the 
two other eubacteria [1].) 
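
The four-sequence parsimony test described above can be sketched as a count of informative alignment columns. With four sequences, only columns in which two sequences share one base and the other two share a different base favor one of the three possible groupings. The Python function below is a minimal illustration with hypothetical names; it is not the procedure of reference 1, whose details are not given here.

```python
def four_taxon_support(outgroup, thermotoga, x, y, gap="-"):
    """Count parsimony-informative columns supporting each grouping.

    All four arguments are aligned sequences of equal length. O is the
    outgroup, T the Thermotoga sequence, X and Y two other sequences.
    """
    seqs = (outgroup, thermotoga, x, y)
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    support = {"(O,T),(X,Y)": 0, "(O,X),(T,Y)": 0, "(O,Y),(T,X)": 0}
    for o, t, a, b in zip(*seqs):
        if gap in (o, t, a, b):
            continue  # skip columns not represented in all sequences
        if o == t and a == b and o != a:
            support["(O,T),(X,Y)"] += 1
        elif o == a and t == b and o != t:
            support["(O,X),(T,Y)"] += 1
        elif o == b and t == a and o != t:
            support["(O,Y),(T,X)"] += 1
    return support
```

The grouping with the largest count is the most parsimonious one for that set of four sequences; the text reports that the outgroup and T. maritima cluster together for every pair of other eubacteria tested.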

Sequence distance measurements (Fig. 5 and 11) show a 
remarkable closeness of the T. maritima sequence to 
outgroup sequences, suggesting that its lineage is the most 
slowly evolving of all eubacterial lineages. 

With the exception of the green non-sulfur group, all of the 
lineages branch from the eubacterial tree in such close 
proximity that their order of branching, the specific relation- 
ships among them, has yet to be convincingly determined. 
However, a few tentative relationships between various 
phyla are suggested and should be mentioned. A specific 
relationship may exist between the cyanobacteria and the 
gram-positive bacteria (Fig. 11). As we have seen, one 
signature position, 1207, in Table 3 supports this conjecture, 
for its composition is constant across all 150 or so gram- 
positive catalogs and all but one of the 30 or so cyanobac- 
terial examples. That photosynthesis is found in both phyla 
and that chlorophyll g of Heliobacterium is most closely 
related in structure to chlorophyll a of the cyanobacteria (17) 
lend additional support to the possible relationship. 

Another "superphylum" suggested in Fig. 11 involves the 
green sulfur bacteria and the bacteroides group. Table 3 
shows several shared signature positions suggestive of that 
grouping, i.e., positions 995 and 1410, and a few higher-order 
structural features in 16S rRNA strengthen the case. The 
helix involving positions 1161 to 1175 (260) is altered in a 
way unique to these two phyla; one nucleotide is deleted 
from the loop in the vicinity of position 1167, while a 
"bulged" nucleotide is inserted in the stalk after position 
1174; the structure in question can be seen in Fig. 7. (The 
deletion alone is seen in several other phyla, and the addition 
alone occurs in a particular subgroup of the a-purple bacteria 
[234].) In both of these phyla the penultimate helix, positions 
1435 to 1466, is strongly truncated, which is rare for 
eubacteria (unpublished analysis). The inclusion of the spi- 
rochetes, planctomyces, and chlamydiae in this superphy- 
lum is also suggested by the Fig. 11 analysis, but this 
relationship should not be considered seriously without 
additional evidence. 

ARCHAEBACTERIAL PHYLOGENY 

Unusual Nature of the Archaebacterial Phenotype 

Microbiologists have always perceived archaebacteria as 
strange, highly atypical bacteria. Prior to their recognition as 
a phylogenetically coherent grouping (6, 261), however, their 
individual idiosyncrasies were interpreted merely as adapta- 
tions: the lipids of Thermoplasma were unusual because the 
organism evolved to live at high temperatures or in highly 
acidic environments or both (16); the wall of Halococcus 
was an adaptation to an extremely saline environment (123, 
185); the uniqueness of their coenzymes merely reflected the 
capacity of methanogens to produce methane from carbon 
dioxide (277). That different archaebacteria had the same 
unusual lipids was even interpreted as convergent adaptation 
(16)! As we find out more about the archaebacteria our sense 
of their strangeness increases, but its explanation lies in a 
shared ancestry, not in individually evolved idiosyncrasies. 

The archaebacteria as we know them today are a collec- 
tion of disparate phenotypes: the methanogens, the extreme 
halophiles, and the extremely thermophilic sulfur-metabo- 
lizing species (250, 270). Their metabolic differences are 
many, their known similarities few. The methanogens are 
noted for unusual coenzymes (Fig. 12), which tend not to 
occur in other bacteria. The extreme halophiles are the only 
photosynthetic archaebacteria; they transduce light into 
chemical energy by means of a proton pump based on 



bacteriorhodopsin, a pigment unique to this group of orga- 
nisms (122, 217). The halobacteria also possess remarkably 
high intracellular salt concentrations (113). (Some methano- 
gens have impressively high internal salt concentrations as 
well [95], but not in the range characteristic of the 
halophiles.) The extreme thermophiles also have at least one 
unique coenzyme (28, 34) (Fig. 12). They share with 
methanogens a capacity to reduce large amounts of sulfur 
(213). The extreme thermophiles have no immediate known 
relatives that grow at or near normal temperatures; most of 
them grow best at remarkably high temperatures (212, 215). 
Systematic comparisons of archaebacterial metabolism are 
definitely needed. 

The branched-chain, ether-linked lipids common to all 
archaebacteria are found nowhere else in nature (35, 69, 103, 
117-120). In many, but not all species the glycerol diethers 
tend to be covalently joined "head to head," to produce 
diglycerol tetraethers, which form unusual membranes that 
cannot be freeze-fractured (118). Archaebacteria show an 
impressive number of variations in lipid structure based 
upon the ether-linked, branched-chain theme (35, 69, 
117-119). 

Archaebacteria display their own characteristic version of 
every major macromolecular function, e.g., 16S rRNA (Fig. 
6). However, within these versions an impressive spectrum 
of variation can occur (250). For example, unlike eubacteria, 
their walls are varied in type (101, 109). In their structural 
details some archaebacterial 5S rRNAs resemble somewhat 
the eubacterial form of the molecule and others resemble the 
eucaryotic form, while still others are unique (53, 133, 208). 
All three urkingdoms have a characteristic subunit pattern 
for DNA-dependent RNA polymerases, but within the 
archaebacteria relatively drastic variations in type are seen 
(187, 282). For the extremely thermophilic species and 
Thermoplasma acidophilum, the so-called B subunit is a 
single large molecule, while for methanogens and their 
relatives it consists of two smaller molecules, B' and B" 
(187, 282). In their transfer RNAs (tRNAs) archaebacteria 
show a characteristic pattern of modified bases (70, 71, 73, 
146, 165, 273), yet the tRNAs of the extreme thermophiles 
(except for Thermoplasma) are much more highly modified 
than are those of the methanogens and relatives (250, 259). 

No new major archaebacterial phenotypes were discov- 
ered in the 9 years following the recognition of archaebac- 
teria as a distinct group, which leads to a growing feeling that 
the three basic and highly unique phenotypes, methanogen- 
esis, extreme halophilism, and extremely thermophilic sulfur 
metabolism, are all that exist in the kingdom. (However, see 
below.) 

Definition of the Major Archaebacterial Groups 

The number of archaebacterial species characterized by 
rRNA analysis is only one-tenth the number of characterized 
eubacterial species. Although their number is not large 
enough to provide phylogenetic detail, it is sufficient to 
identify the higher archaebacterial taxa, for the sampling is 
broadly representative. 

rRNA cataloging studies showed that there are three 
major groups of methanogens and one of extreme halophiles 
(5, 52, 54, 56, 250), each being the equivalent of a eubacterial 
phylum. Although the branching order among the four phyla 
could not be determined by this method, the four as a group 
were easily distinguished from the group of extremely 
thermophilic archaebacteria (excluding Thermoplasma). 
However, too few species of extreme thermophiles were 
characterized by the cataloging method to give a sense of 
that group's phylogeny (225, 259). 

Table 14 lists the archaebacterial species that have been 
characterized by the rRNA method, in an approximate 
phylogenetic ordering. Tables 15 and 16 define the major 
archaebacterial groupings by sequence signature. 

Methanogens. The three methanogen phyla defined by 
rRNA sequence comparisons can also be recognized by 
morphological criteria, with a few exceptions (5, 238). Im- 
munological cross-reactivities almost always identify meth- 
anogen group affiliation as well (135). As a result, we know 
that each of the three main groups of methanogens (currently 
designated orders), Methanobacteriales, Methanococcales, 
and Methanomicrobiales, contains a large number of species 
(5, 98, 135, 238, 239). To date, the immunological studies 
have given no clear indication that any additional major 
methanogen group exists (135). The three methanogen phyla 
are also distinguishable by 5S rRNA type (53). 

In two of the methanogen phyla, subdivisions are recog- 
nized by rRNA cataloging. The Methanobacteriales breaks 
naturally into two "genera," Methanobacterium and Meth- 
anobrevibacter (5, 238). A third genus, now represented by 
the lone species Methanothermus fervidus (214), should 
ultimately be declared a distinct unit within this phylum, or 
perhaps a separate phylum (238). The Methanomicrobiales 
in turn divide into two distinct subdivisions (formally fam- 
ilies), the Methanomicrobiaceae and the Methano- 
sarcinaceae (5, 238). The latter is the most metabolically 
unusual of the methanogen groups. Its species can utilize 
acetate or sometimes methyl amines in methane production; 
some are even unable to use carbon dioxide for this purpose 
(9, 42, 153). These are also the only methanogen species that 
contain cytochromes (b or c or both) (112). The unusual 
halophilic methanogens belong to this group as well (13; I. 
M. Mathrani, D. R. Boone, and R. A. Mah, Abstr. Annu. 
Meet. Am. Soc. Microbiol. 1985, 185, p. 160). 

TABLE 14. Archaebacterial subdivisions, representative 
genera, and species^a 

Methanococcus group 
  Methanococcus vannielii 
  M. voltae 
  M. maripaludis 
  M. thermolithotrophicus 
  M. jannaschii 

Methanobacter group 
  Methanobacterium formicicum 
  M. bryantii 
  M. thermoautotrophicum 
  Methanobrevibacter smithii 
  M. arboriphilus 
  M. ruminantium 
  Methanosphaera stadtmaniae 
  Methanothermus fervidus 

Methanomicrobium group 
  Methanosarcina barkeri 
  Methanococcoides methylutens 
  Methanothrix soehngenii 
  Methanospirillum hungatei 
  Methanomicrobium mobile 
  Methanomicrobium paynteri 
  Methanogenium cariaci 
  Methanogenium marisnigri 
  Methanoplanus limicola 

Halobacteria 
  Halobacterium volcanii 
  H. cutirubrum^b 
  H. sodomense 
  H. trapanicum 
  Halococcus morrhuae 

Thermoplasma 
  Thermoplasma acidophilum 

Thermococcus group 
  Thermococcus celer 

Extreme thermophiles 
  Sulfolobus solfataricus 
  S. acidocaldarius 
  Thermoproteus tenax 
  Desulfurococcus mobilis 
  Pyrodictium occultum 

^a References 5, 98, 238, and 263. Approximate phylogenetic clustering 
suggested by indentation. 
^b 16S rRNA identical to those of H. salinarium and H. halobium. 

Extreme halophiles. The extreme halophiles constitute one 
of the most distinctive groups of bacteria known. As men- 
tioned, both their very high internal salt concentrations (i.e., 
in the range of 5 M potassium ion [113]) and their mechanism 
for photoproduction of energy are unique in the bacterial 
world. (Bacteriorhodopsin does superficially resemble the 
eucaryotic visual pigment, however, hence its name [122, 
217].) Over ten species of extreme halophiles have been 
characterized in terms of 16S rRNAs, and they form a 
relatively close-knit grouping (56, 72, 90, 125; C. R. Woese 
and G. E. Fox, unpublished data). The unusual halophiles 
that grow under alkaline conditions (222) are among them. 
The group is not a particularly deep one. Based upon relative 
Sab values (Woese and Fox, unpublished data) and known 
sequences, all rRNAs in the group show at least 87% 
sequence similarity. The internal phylogenetic structure of 
the group is unspectacular. 

The extreme halophiles are known to contain cytochromes 
and ferredoxins (78, 113, 121). It has been reported that 
halophile ferredoxin sequences are specifically related to 
those found in cyanobacteria, to the exclusion of other 
eubacteria (80). If such homology exists, it is highly unlikely 
to reflect a genuine phylogenetic relationship; the cyanobac- 
teria and extreme halophiles are not related to one another to 
the exclusion of the other members of their respective 
kingdoms. Sequence convergence also seems unlikely. 
However, gene transfer does not, for the two types of 
organisms can in some cases share the same habitat. 

The cataloging approach failed to distinguish the branch- 
ing order among the three groups of methanogens and the 
extreme halophiles. Most microbiologists tacitly assumed 
that the four phyla were arranged along phenotypic lines; 
i.e., all methanogens clustered together to the exclusion of 
the halophiles. However, oligonucleotide signatures weakly 
suggested a specific relationship between the halophiles and 
the Methanomicrobiales (275). 

Extreme thermophiles. The extreme thermophiles are the 
least characterized, but (as will be seen below) the most 
evolutionarily interesting, of the archaebacteria. The ex- 
treme thermophiles seem quite uniform in phenotype. All 
species grow anaerobically, and most require sulfur as an 
energy source (215). A minority of species can also grow 
aerobically, and some that use sulfur as an energy source do 
not require it (215). Most species grow best at extremely high 
temperatures, some near the boiling point of water (212). 

The extreme thermophiles differ from other archae- 
bacteria in numerous ways. Their modes of division tend to 
be unusual and varied (281), as do their viruses (279, 281); 
their ribosomal subunits have an unusual shape (see discus- 
sion below) (81); they have (as mentioned) at least one 
unique coenzyme; they seem to be insensitive to most 
antibiotics (22); and both their tRNAs and rRNAs are highly 
modified (the level in the latter case is fivefold greater than 
that seen in the methanogens and their relatives) (259). 
Nevertheless, in sequence terms, in membrane structure, 
and in most phenotypic characteristics, the extreme 
thermophiles are definitely archaebacteria (141, 142, 215, 
263). 

The phenotypic clustering of extreme thermophiles is 
deceptive. As we shall see, it is not a phylogenetic cluster- 
ing. rRNA cataloging studies showed that Thermoplasma 
was more closely related to the methanogens and relatives 
than to the extreme thermophiles (56). Even its pattern of 
16S rRNA base modification is "methanogen-like" (56, 262). 
However, this potential relationship was at first given little 
credence, and the organism tended to be viewed as an 
atypical extreme thermophile (215). Although an rRNA 



catalog did not exist for Thermococcus, there seemed every 
reason to assume it was related to the other extreme 
thermophiles (215), but, as we shall see, this is not so. 

The Archaebacterial Tree: Its Branching Order and Root 

An evolutionary distance matrix for the known archae- 
bacterial 16S rRNA sequences is shown in Table 17. A 
phylogenetic tree derived from evolutionary distances is 
shown in Fig. 13 (263). With two minor exceptions, the same 
branching order is given by a number of different methods: 
parsimony analysis, distance-matrix treeing, using subsets of 
the positions in the sequence alignment, etc. (263; unpub- 
lished analysis). One of the exceptions, Sulfolobus 
solfataricus, tends to cluster with Desulfurococcus mobilis 
by parsimony analysis and when no or few outgroups are 
included in a distance treeing analysis; yet when the full set 
of archaebacterial sequences is analyzed by a distance- 
matrix method, D. mobilis clusters instead with Pyrodictium 
occultum, to the exclusion of S. solfataricus. The S. 
solfataricus line of descent is more rapidly evolving than are 
these others (as can be inferred from Table 17), which tends 
to force its branching deeper into the tree than it actually is, 
particularly when outgroup sequences are included in the 
analysis. 

For the same reason the exact branching order for 
Thermoplasma acidophilum, the second exception, is uncer- 
tain, though its lineage always remains in the general vicinity 
of the position shown in Fig. 13. 
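
The pairwise comparisons that feed a matrix such as Table 17 can be illustrated with a minimal sketch. It is not the procedure used for the published values: the toy sequences and all function names are hypothetical, and the Jukes-Cantor formula appears only as a familiar stand-in for the evolutionary distance calculation cited in the table footnote. The sketch follows the two restrictions stated in the footnotes: only alignment positions represented in all sequences are compared, and for percent similarity the positions of constant composition are also removed.

```python
import math

def shared_columns(seqs, gap="-"):
    """Columns in which every aligned sequence has a base (no gap)."""
    return [i for i in range(len(seqs[0])) if all(s[i] != gap for s in seqs)]

def variable_columns(seqs, columns):
    """Subset of `columns` whose composition is not constant across all sequences."""
    return [i for i in columns if len({s[i] for s in seqs}) > 1]

def percent_similarity(a, b, columns):
    """Percentage of the chosen columns at which a and b carry the same base."""
    same = sum(1 for i in columns if a[i] == b[i])
    return 100.0 * same / len(columns)

def jukes_cantor_distance(a, b, columns):
    """Assumed stand-in correction: d = -(3/4) ln(1 - 4p/3), p = fraction of differing columns."""
    p = sum(1 for i in columns if a[i] != b[i]) / len(columns)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Hypothetical toy alignment (not real rRNA data).
toy = {"seqA": "GAUCCGUAAC", "seqB": "GAUCCGUUAC", "seqC": "GACC-GUUGC"}
aligned = list(toy.values())
shared = shared_columns(aligned)                 # drops the gapped column
informative = variable_columns(aligned, shared)  # drops constant columns as well

print(percent_similarity(toy["seqA"], toy["seqB"], informative))
print(jukes_cantor_distance(toy["seqA"], toy["seqB"], shared))
```

Removing the constant positions lowers every similarity value but, as the table footnote notes, does not change their rank order.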

Main branches. The root of the archaebacterial tree di- 
vides the urkingdom into two main lineages: a cluster of 
extreme thermophiles on the one hand, and the methanogens 
and their relatives on the other. 

The two main branches differ from each other in several 
interesting ways. For one, the methanogen branch appears 
to be the phylogenetically "deeper" of the two. For another, 
the extreme thermophile branch (so far) is phenotypically 
pure, whereas its counterpart is cosmopolitan, including a 
mixture of methanogenic, extremely halophilic, and ex- 
tremely thermophilic phenotypes. 

TABLE 15. Sequence signature distinguishing the two main 
archaebacterial branches 

Composition in: 

Position^a   Methanogens and relatives^b   Thermoplasma acidophilum^c   Thermococcus celer^d   Sulfur-dependent archaebacteria^e 

34           U   [...]   -^f   [...]   U   [...] 
559          C   [...] 
965          Y   [...] 
1074         A   [...] 
1088         U   A   A   G 
1252         U   -   -   C 
1351         U   -   C   C 
1408         A   -   -   G 

^a In 16S rRNA sequence (260). 
^b Based upon oligonucleotide catalogs or sequences for 21 species of 
methanogens, and 9 species of halophiles (5, 6, 54, 72, 91, 125; Woese, 
unpublished data). 
^c Based upon catalog (262) and an unpublished sequence (R. A. Zimmer- 
mann, personal communication). 
^d Based upon the sequence (Woese et al., unpublished data). 
^e Based upon oligonucleotide catalogs or sequences for five sulfur-depen- 
dent archaebacteria (126, 158, 225; R. Garrett, personal communication; 
Woese et al., unpublished data). 
^f -, Same composition as in methanogens and relatives. 
^g One example of G at this position (5). 

TABLE 16. Sequence signatures defining the methanogen and extreme halophile groups^a 

Composition in: 
Extreme halophiles 

^a See footnotes to Table 15 for details. 
^b In 16S rRNA sequence. 
^c Mco, Methanococcus, five species (5, 96, 98). 
^d Mba, Methanobacter and relatives, eight species (5, 124; unpublished data). 
^e Mmi, Methanospirillum and relatives, eight species (5, 275; unpublished data). 
^f Hal, Extreme halophiles, nine species (72, 91, 125; unpublished data). 
^g Tac, Thermoplasma acidophilum (K. M. Cao, H. Ree, D. L. Thurlow, and R. A. Zimmermann, personal communication). 
^h Tce, Thermococcus celer (unpublished data). 
^i -, Composition equals major base. 



TABLE 17. Percent similarities and evolutionary distances for archaebacterial 16S rRNAs^a 

Columns, in the same order as the rows: Mc. vannielii, M. formicicum, Ms. hungatei, Methanosarcina sp. strain WH-1, 
H. volcanii, T. acidophilum, Tc. celer, S. solfataricus, Tp. tenax, D. mobilis, P. occultum. 

Species                Values (- marks the diagonal) 
Mc. vannielii          -  20.3^b  27.7  26.4  27.8  30.0  19.7  28.6  29.0  24.2  25.5 
M. formicicum          69.2^c  -  23.0  25.0  25.5  29.3  20.0  30.0  28.0  24.1  24.8 
Ms. hungatei           59.9  65.7  -  19.8  23.6  [...]  26.3  33.6  32.9  30.4  30.8 
Mr. sp. strain WH-1    61.5  63.2  69.8  [...]  32.3  30.9  29.8  28.5 
H. volcanii            [...]  62.6  64.9  61.6  [...]  35.6  [...]  32.2  [...] 
T. acidophilum         [...]  58.0  53.1  57.2  55.5  -  26.4  [...] 
Tc. celer              [...]  61.6  63.5  60.5  61.5  [...] 
S. solfataricus        [...]  57.2  53.1  [...]  52.8  63.8  -  17.1  [...]  12.8 
Tp. tenax              58.3  59.6  53.9  56.1  52.8  52.0  69.5  [...]  14.1  11.7 
D. mobilis             64.1  64.3  56.7  57.3  54.7  53.4  70.8  80.3  77.7  [...] 
P. occultum            62.6  63.3  56.2  58.9  56.2  55.0  72.7  79.6  81.2  88.6  - 

^a Sequence references are cited in Fig. 13 legend. Mc., Methanococcus; M., Methanobacterium; Ms., Methanospirillum; Mr., Methanosarcina; H., 
Halobacterium; T., Thermoplasma; Tc., Thermococcus; S., Sulfolobus; Tp., Thermoproteus; D., Desulfurococcus; P., Pyrodictium. 
^b Upper right numbers are evolutionary distances (99); only positions in alignment represented in all sequences are used in calculation. 
^c Lower left numbers are percent similarity; all positions not represented in all sequences and all positions of constant composition were removed from 
consideration. (This last has been done to accentuate the differences among sequences; it does not change their rank order.) 

FIG. 13. Archaebacterial phylogenetic tree based upon 16S rRNA sequence comparisons. The sequences listed were aligned and an 
unrooted distance tree was constructed as in the legend to Fig. 11. Its root was subsequently imposed on the basis of outgroup consensus 
sequences (eubacterial and eucaryotic); the root given by eubacteria or eucaryotes is in the same general region of the tree (i.e., between 
Thermococcus and the other extremely thermophilic species [263]), and that shown represents an average of the eubacterial and eucaryotic 
placements. Those sequences used in the alignment are Methanospirillum hungatei (275); the halobacteria Halobacterium volcanii (72), 
Halococcus morrhuae (125), and Halobacterium cutirubrum (91), from left to right in that order; Methanosarcina sp. strain WH-1 (P. 
Rouviere, unpublished data; Woese et al., unpublished data); halophilic methanogen strain FS-1 (Mathrani et al., Abstr. Annu. Meet. Am. 
Soc. Microbiol. 1985; Woese et al., unpublished data), Methanobacterium thermoautotrophicum (R. Garrett, personal communication) and 
Methanobacterium formicicum (124), from left to right, representing Methanobacterium; Methanococcus vannielii (96); Thermococcus celer 
(Woese et al., unpublished data); Thermoproteus tenax (126); Pyrodictium occultum (Woese et al., unpublished data); Sulfolobus solfataricus 
(158); and Desulfurococcus mobilis (Garrett, personal communication). 



Thermococcus and Thermoplasma. The two extreme 
thermophiles Thermococcus and Thermoplasma are unre- 
lated to their phenotypic counterparts. As we have seen, 
their phylogenetic placement with the methanogens is con- 
sistent in the case of Thermoplasma acidophilum with the 
pattern of modified nucleotides in the 16S rRNA, i.e., a low 
level of modifications, at particular sites (5, 262). (Nothing is 
known yet about the pattern of modified nucleotides in 
Thermococcus RNAs.) 

Interestingly, Thermococcus celer is not closer to the 
methanogens than to the other extreme thermophiles by 
overall sequence distance measure. Table 17 shows its 16S 
rRNA to be closest to that of Pyrodictium occultum. This 
would seem, however, to reflect the relatively slow evolu- 
tionary tempo among extreme thermophiles in general, not a 
specific relationship between these two organisms, a point 
that can be clearly demonstrated by signature analysis, 
which focuses on the more conserved positions in the 
molecule. For example, in a 16S rRNA alignment containing 
sequences from eight methanogens, three extreme halo- 
philes, and four representatives of the extreme thermophile 
branch, there are about 30 positions that have a constant 
composition among the methanogens and extreme halophiles 
but a different (constant) composition among the extreme 
thermophiles. The Thermococcus celer sequence exhibits 
the characteristic methanogen composition in about 79% of 
these cases; the extreme thermophile composition in only 
11% (263; Woese, unpublished analysis). 
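
The signature comparison just described can be sketched in a few lines. This is only an illustration under stated assumptions: the sequences are hypothetical toy strings rather than rRNA data, the function names are invented for the example, and gaps are not treated specially. A position counts as a signature position when each group is constant in composition and the two groups differ; a query sequence is then scored by which group it matches at each such position.

```python
def signature_positions(group_a, group_b):
    """Columns that are constant within each group but differ between the groups."""
    positions = []
    for i in range(len(group_a[0])):
        bases_a = {s[i] for s in group_a}
        bases_b = {s[i] for s in group_b}
        if len(bases_a) == 1 and len(bases_b) == 1 and bases_a != bases_b:
            positions.append(i)
    return positions

def score_query(query, group_a, group_b):
    """Count signature positions where the query matches group A, group B, or neither."""
    counts = {"a": 0, "b": 0, "neither": 0}
    for i in signature_positions(group_a, group_b):
        if query[i] == group_a[0][i]:
            counts["a"] += 1
        elif query[i] == group_b[0][i]:
            counts["b"] += 1
        else:
            counts["neither"] += 1
    return counts

# Hypothetical toy alignment: two small groups and one query sequence.
group_a = ["ACGUAC", "ACGUCC"]
group_b = ["UCAUGG", "UCAUGA"]
print(signature_positions(group_a, group_b))
print(score_query("ACAUAC", group_a, group_b))
```

Because only positions that are constant within each group are counted, the tally emphasizes the more conserved parts of the molecule, which is the property the text relies on when it sets signature analysis against overall sequence distance.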

That Thermococcus and Thermoplasma resemble other 
extremely thermophilic archaebacteria in DNA-dependent 
RNA polymerase subunit pattern (187, 282) does not neces- 
sarily prove genealogical relatedness. Zillig and co-workers 
have shown that the large B subunit of RNA polymerase, 



found in all extreme thermophiles, is undoubtedly the ances- 
tral type, for the smaller B' and B" subunits appear to have 
arisen from it at least twice (281). For this reason the 
common occurrence of a large RNA polymerase B subunit 
does not necessarily mean specific relationship. 

The branching of Thermococcus from the main methano- 
gen line of descent is sufficiently deep to suggest that it may 
ultimately be considered to represent a third major archae- 
bacterial lineage. 

Branching order among the methanogens and extreme 
halophiles. Full sequences show that the extreme halophiles 
cluster specifically with the Methanomicrobiales, to the 
exclusion of the other two phyla of methanogens, a relation- 
ship that was hinted at by oligonucleotide signatures (263, 
275). Although this is an unexpected, even counterintuitive, 
finding, the evidence supporting it is entirely convincing. 
The conclusion readily emerges from parsimony analyses of 
16S rRNAs as well as distance treeing (263). For example, in 
an alignment that includes sequences from the three extreme 
halophiles, one from each of the methanogen phyla, and the 
Thermococcus sequence, there are 22 positions having a 
common composition in the halophile and Methanospirillum 
sequences that have a different common composition in the 
remaining three sequences. For any other combination of 
these sequences (keeping the three halophile sequences as a 
unit), there are no more than seven positions of common 
composition defined in this way (263; unpublished analysis). 

Although the relationship between the extreme halophiles 
and the Methanomicrobiales is difficult to justify phenotyp- 
ically, certain facts are consistent with such a grouping. For 
example, methanogens capable of growth at high salt con- 
centrations belong to the Methanomicrobiales (Fig. 13). On 
the methanogen branch of the tree, it is only among species 
of Methanosarcina, the extreme halophiles, and Thermoplas- 
ma that cytochrome b or c or both are found (78, 86, 112, 
121, 190). 

FIG. 14. Eucaryotic phylogenetic tree based upon 16S-like rRNA sequence comparisons. The eucaryotic sequences listed were aligned 
and a distance matrix tree was constructed as in the legend to Fig. 11. The root was determined by including an archaebacterial consensus 
sequence in the alignment. Those eucaryotic sequences used are as follows: microsporidia, Vairimorpha necatrix (226a); flagellates, 
Trypanosoma brucei and Euglena gracilis (196); slime molds, Dictyostelium discoideum (145); ciliates, Paramecium tetraurelia (195); 
dinoflagellates, Prorocentrum micans (84); fungi, Saccharomyces cerevisiae (179); animals, Xenopus laevis (181); plants, Zea mays (147). The 
archaebacterial (100%) consensus sequence was derived from an alignment of three methanogens and three extreme thermophiles. 

Figure 13 also indicates a specific relationship between the 
Methanobacteriales and the halophile-Methanomicrobiales 
group, to the exclusion of the Methanococcales. This rela- 
tionship, too, can be rationalized by sequence signature 
(263). For example, the Methanobacterium formicicum se- 
quence shares more than twice as many positions of exclu- 
sive common sequence with the Methanospirillum hungatei 
and extreme halophile sequences as does the Methanococ- 
cus vannielii sequence (263). 

The position of Thermoplasma in the tree, as mentioned, 
is uncertain. Various treeing procedures place its branching 
in a range that extends from somewhere on the common 
Methanospirillum-halophile branch to just below the Methan- 
ococcus branch (263). A signature marginally suggestive of 
the first placement exists (263), but the exact genealogy of 
the organism should be considered uncertain. 

A BRIEF LOOK AT EUCARYOTE PHYLOGENY 

The biologist generally feels that he has a relatively good 
sense of eucaryotic evolution, and up to a point this is 
certainly true. Detailed taxonomies exist for the various 
classes of animals and plants that rather accurately reflect 
their phylogenies. The higher levels of eucaryote classifica- 
tion are another matter, however. Here our understanding is 
no less a matter of prejudice and preconception than it was 
for the bacteria. The so-called Five Kingdom classification 
(139, 240, 241), plants, animals, fungi, protists, and monera, 
cannot be considered proper phylogeny. It mixes apples and 
oranges and defines categories by exclusion. The system 
gives the same kingdom rank to each of the four groups of 
eucaryotes that it gives to one group of procaryotes 
(monera). Yet it has been obvious for some time that the four 
eucaryotic kingdoms form a phylogenetically coherent unit 



that as a whole ranks with the monera; and monera, of 
course, is now known to comprise two separate kingdoms 
(56). Within the eucaryotes, the protists do not form a 
phylogenetically coherent unit (196). 

Fortunately, this scheme and our criticism of it will soon 
be rendered academic. A new and very different view of 
eucaryotic phylogeny is beginning to emerge, whose outlines 
can be seen in Fig. 14. 

Implications of the Eucaryote Phylogenetic Tree 

What this preliminary phylogeny begins to suggest is that 
the major epochs in eucaryote evolution corresponded to 
major periods in earth history. A relatively "recent" period 
of massive evolutionary radiation appears to have given rise 
to most of the major eucaryotic lineages: green plants, 
animals, fungi, ciliates, dinoflagellates, (some) amoebae, etc. 
(196). (The cellular slime molds represent a slightly earlier 
branching [145, 196].) It is tempting to equate the onset of 
this particular radiation with a globally significant event and 
attribute the radiation to some major innovative biological 
(evolutionary) response. The hydrosphere is thought to have 
become oxidizing about 1.5 billion years ago (228), which 
time is roughly consistent with the occurrence of the radia- 
tion estimated by back-extrapolation from known time 
points on the eucaryotic tree (196). Biologically this could 
have been a time when an oxygen-utilizing mitochondrion 
developed (274). 

Deeper lineages occur on the eucaryotic tree, however, 
that appear to predate significantly this period of massive 
radiation. A group of flagellates is seen to branch from the 
main eucaryotic stem well before the radiation in question 
(196), and the microsporidian lineage definitely emerges well 
before that (226a). These earlier branchings should reflect 
earlier phases in the planet's history. The flagellate branch- 
ing might stem from the "microaerobic" or "amphiaerobic" 
period, i.e., the era between the time the atmosphere be- 
came significantly oxidizing (over 2 billion years ago) and the 
hydrosphere became aerobic (about 1.5 billion years ago) 
(228). The microsporidian branching would represent the 
still earlier genuinely anaerobic period, before 2.5 or so 
billion years ago (226a). 

Microsporidia, e.g., the genus Vairimorpha, are a group of 
highly unusual and little studied unicellular eucaryotes, 
deserving of more than passing attention. They have primi- 
tive modes of cell division (175) and exhibit strange and 
interesting life cycles, connected with their obligately para- 
sitic mode of existence. The group as a whole parasitizes an 
extremely wide range of other eucaryotes; they have been 
seen to infect examples of all the animal phyla, and they 
even parasitize other protists (197). This could be interpreted 
to mean that their parasitism is of very ancient origin. 
Microsporidia have no mitochondria and, given their deep 
phylogenetic branching, might never have had them. 

Perhaps their molecular idiosyncrasies are the most fasci- 
nating aspect of the microsporidia (and very little is known 
about this). These are the only eucaryotes that have no 5.8S 
rRNA (227). Their rRNAs are also far smaller than normal 
eucaryotic rRNAs. Typically, the eucaryotic small-subunit 
rRNA comprises about 1,800 nucleotides, but its 
microsporidian counterpart contains under 1,300 nucleo- 
tides, even less than the roughly 1,450 to 1,550 nucleotides 
characteristic of procaryotes (75, 226a). Interestingly, the 
missing areas in the microsporidian small-subunit rRNA 
tend to be those that are unique to the eucaryotes (75) (Fig. 
5). 

With the microsporidian branching we may be near the 
base of the eucaryotic tree, the beginnings of eucaryotic 
cellular evolution. Microsporidia are defined as eucaryotes 
because they have a membrane-delimited nucleus. The ques- 
tion is, in what other ways do they resemble eucaryotes, and 
what characteristic eucaryotic features do they lack? 
Eucaryotes would appear to be an old group, far older than 
many biologists might have thought. Their antiquity would 
seem to rival that of the procaryotic kingdoms. 



NATURE OF THE EVOLUTIONARY PROCESS 
IN BACTERIA 

Relationship Between Evolution's Tempo and Its Mode 

Classical evolutionists recognized that a relationship ex- 
isted between the rate at which evolution proceeds, its 
tempo, and the quality of the changes that occurred, its 
mode (143, 192). Fossil evidence showed that some lineages 
evolve more rapidly than others and that rates of phenotypic 
change vary within lineages at different stages in their 
history (192). Evolution tended to be particularly rapid as a 
lineage came into being and also in some cases as it died out 
(192). The quality of phenotypic change was different during 
such periods of rapid evolution; it was often described as 
drastic and novel, even bizarre (192). 

Two general rules governed the relationship between the 
tempo and mode of metazoan evolution: (i) true evolutionary 
novelty (of the kind that gives rise to major groups) occurred 
only during times of rapid evolution, and (ii) rapid evolution 
tended to be episodic, not chronic. (Evolution of the horse 
was often used as an example. A relatively short period of 
dramatic evolutionary change transformed the common an- 
cestor of horses, tapirs, and rhinoceroses into a horselike 
creature. The evolution of this ancestral horse into the 
modern form, which was a far more protracted affair, 



involved relatively little further change [192].) Two other 
characteristics of rapid evolution are its instability and 
radiation. Of the numerous lineages typically formed from a 
common ancestral stock during these saltatory episodes, 
many, if not most, were short-lived (192). The origin of the 
animal phyla conformed to such a pattern — all seem to have 
burst forth, almost simultaneously (192). The same can now 
be said of the origin of the main eucaryotic kingdoms (Fig. 
14). 

Evolutionists have debated the whys and wherefores of 
the tempo-mode issue for decades. In the past, discussion 
centered on whether the same evolutionary mech- 
anisms or environmental conditions that underlay the chronic 
progressive evolution characterizing normal estab- 
lished lineages, usually referred to as microevolution, also 
underlay rapid episodic evolution, variously called macro- 
evolution, megaevolution, or quantum evolution (192). Ini- 
tially, global catastrophes or elevated mutation rates had 
been invoked to explain the radiating, saltatory origin of 
major taxa, and some biologists went so far as to declare 
macroevolution (megaevolution) and microevolution to be 
different in kind (66). However, the idea that catastrophes 
played a necessary or even major role in radical evolutionary 
change was later rejected (192), and elevated mutation rates 
were no longer seen as required for episodic, drastic evolu- 
tionary change; in fact, they could not explain it (192). Rapid 
evolution, macroevolution, was solely the result of ecologi- 
cal considerations, population genetics: small population 
sizes, rapidly changing environments, untoward conditions, 
and the like, were all that need be invoked (143, 192, 272). 
(However, note that we are seeing a return in recent times to 
the global catastrophe type of explanation for radical evolu- 
tionary change; good evidence now supports the idea that 
the effects of a comet or asteroid impact lead to the extinc- 
tion of the dinosaurs, thereby allowing the subsequent 
evolutionary radiation of the mammals [3].) 

Some biologists today seem to feel that
microevolution-macroevolution is a nonissue, the difference between them
being only a matter of degree. All distinctions, all bound- 
aries, however, are matters of degree when viewed finely 
enough; this is especially apparent for protracted processes 
such as evolution. The significance of a distinction turns not 
on whether it is a matter of degree, but on how sharp the 
boundary is relative to the space/time scale of the phenom- 
ena that define it. 

One thing is clear about the tempo-mode issue: it will 
never be resolved if its study is confined to fossil evidence. 
In these terms the crucial parameters are too poorly defined
and distinguished; the phenomena are too elusive and inaccessible
and too difficult to explain. It is important, therefore,
to try to generalize the tempo-mode problem to bacterial
systems and to study it in molecular terms.

(In the following discussion I will use the term "macroev- 
olution" to mean the episodic, saltatory, radiating type of 
evolution that can create major taxa and is often associated 
with instability in the newly formed lineages [192]. Although 
this may not be the strict conventional usage of the term, 
there should be no problem as long as its usage is under- 
stood.) 



Macroevolution at the Molecular Level: the Mycoplasmas 
and Their rRNAs 

Changes in molecular sequence are the most basic mani- 
festation of evolution's tempo. Molecular chronometers, as 
FIG. 15. Phylogenetic tree for the mycoplasmas and other members of the gram-positive bacteria (see Table 18 footnotes), based upon 16S
rRNA sequence comparisons. An alignment consisting of the sequences shown was used to construct an unrooted distance matrix tree as in
the legend to Fig. 11. The root was imposed by using these sequences: Anacystis nidulans (224), Desulfovibrio desulfuricans (163),
Agrobacterium tumefaciens (274), and Escherichia coli (19).



we have seen, can measure a pure tempo, essentially unaf- 
fected by the mode, the overlying phenotypic changes. 
Unfortunately, sequence distances do not provide as good a 
measure of evolutionary rate as we would like. They provide 
only average rates, over relatively long evolutionary time 
spans, when what is required for a proper formulation of the 
tempo-mode problem in molecular terms are measures of 
changes in rate during the course of a lineage's evolution, as 
are found in the metazoan fossil record. However, as will be 
seen, the "chronometric structure" of rRNAs (and presumably
other macromolecules) is such that not only average
rates, but also indications of rate changes (i.e., peak rates)
are recorded. The main problem we appear to face, then, is
finding the molecular counterpart of that ill-defined quality, 
evolutionary mode, if such exists. This problem too will 
prove tractable, for on the molecular level evolutionary 
tempo and mode are intimately connected; they are essen- 
tially different manifestations of the same process. 

If macroevolution occurs among bacteria, it should be 
most evident in the most rapidly evolving bacterial lineages. 
Two eubacterial lineages in particular are attractive from this 
point of view: the planctomyces group (202) and the 
mycoplasmas. Of these, the mycoplasmas are at least as 
rapidly evolving as the planctomyces. And, what is more 
important for the present discussion, they have known close 
relatives that evolve at normal rates, e.g., Bacillus (262). Our 
discussion will therefore focus on mycoplasmas. 

Mycoplasmas show the main characteristic expected of 
rapidly evolving lineages, an unusual phenotype, the nature 
of which has puzzled microbiologists for decades (176). 
These organisms have no cell walls; the cell membrane is 
their outer boundary. They have a number of cytological and 
biochemical peculiarities, and their genomes are far smaller 
than normal bacterial genomes (137, 176). Some microbiol- 
ogists took their unusual phenotypes to mean that mycoplas- 
mas were extremely primitive, not at all related to ordinary 
bacteria (229); others saw mycoplasmas merely as degener- 



ate forms of certain normal bacteria (176). The current 
consensus among microbiologists, reflected in mycoplasma 
classification, seems to be that these organisms (with the 
exception of Thermoplasma, an archaebacterium) constitute 
a phylogenetically distinct group of highly unusual bacteria 
that is distantly related to the eubacteria (62, 176). 

However, mycoplasmas are not distinctive genealogically. 
By rRNA measure they are merely gram-positive eubacteria. 
They and their relatives, as seen above, reside high in the 
gram-positive tree, as one subline of a particular subgroup in 
the low-G+C subdivision of that phylum (178, 262, 265). 
And, as Fig. 15 shows, the mycoplasmas have specific 
clostridial relatives, for example, Clostridium innocuum 
(262, 265). In other words, mycoplasmas seem unusual not 
because of a remote phylogenetic position, but because their 
mode of evolution has for some reason been atypical. 

Despite their mundane genealogy, the rRNAs of mycoplasmas
are definitely unique (262, 265). As Fig. 15 and
Table 18 show, mycoplasma rRNA sequences change more
rapidly than do those of normal eubacteria, but more importantly,
certain positions in the 16S rRNA sequence that tend
to be invariant in composition, what are normally the
"phylogenetically uninformative" positions, show significant
variation in the mycoplasma sequences (265) (Table 19).
Since oligonucleotides representing the regions of highly
conserved sequence make a major contribution to S_AB
values, the latter tend to drop dramatically in rRNAs that
violate these invariances. Table 18 permits a comparison of
S_AB values to percent sequence similarities for the
mycoplasmas and some of their normal relatives; the S_AB
values between Mycoplasma gallisepticum and other
eubacteria in particular are as low as those between normal
eubacteria and archaebacteria! Varying the normally conserved
positions seems characteristic of all rapidly evolving
lineages. As mentioned above, a similar disparity between
S_AB values and percent sequence similarity holds for the
planctomyces (202).
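
The sensitivity of S_AB to changes in conserved regions can be illustrated with a small sketch. It assumes the usual definition of the association coefficient, which the text does not restate: S_AB = 2N_AB/(N_A + N_B), where N_A and N_B count the nucleotides in ribonuclease T1 oligonucleotides of at least six residues in catalogs A and B, and N_AB counts those in oligonucleotides common to both. The function names and the toy sequences are hypothetical, and the percent similarity shown is a simple pairwise version that skips gapped positions.

```python
from collections import Counter


def t1_fragments(seq: str) -> list[str]:
    """Split an RNA sequence after every G, mimicking RNase T1 digestion."""
    frags, start = [], 0
    for i, base in enumerate(seq):
        if base == "G":
            frags.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        frags.append(seq[start:])  # trailing piece without a terminal G
    return frags


def s_ab(seq_a: str, seq_b: str, min_len: int = 6) -> float:
    """Association coefficient over oligonucleotide catalogs (assumed definition)."""
    cat_a = Counter(f for f in t1_fragments(seq_a) if len(f) >= min_len)
    cat_b = Counter(f for f in t1_fragments(seq_b) if len(f) >= min_len)
    n_a = sum(len(f) * c for f, c in cat_a.items())
    n_b = sum(len(f) * c for f, c in cat_b.items())
    n_ab = sum(len(f) * c for f, c in (cat_a & cat_b).items())  # shared oligos
    return 2 * n_ab / (n_a + n_b) if (n_a + n_b) else 0.0


def percent_similarity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical positions, ignoring columns where either has a gap."""
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if "-" not in (x, y)]
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


# Toy sequences (hypothetical): three 7-mers, one substitution in the second.
seq_a = "AAUCCUGACUUAAGCCAUUAG"
seq_b = "AAUCCUGACUAAAGCCAUUAG"
print(round(percent_similarity(seq_a, seq_b), 1), round(s_ab(seq_a, seq_b), 3))
```

In this toy case a single substitution leaves the percent similarity above 95 but removes one of the three shared oligonucleotides, so S_AB falls to two-thirds. That is the kind of disparity between the two measures that the text describes.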
TABLE 18. Percent sequence similarities and S_AB values for 16S rRNAs of mycoplasmas and related eubacteria(a)

Species           M. galli-   M. cap-     S. citri    Ac. laid-   C. inno-    B. subtilis H. chlorum  Ar. glo-    An. ni-     A. tume-    D. desul-   E. coli
                  septicum    ricolum                 lawii       cuum                                biformis    dulans      faciens     furicans

M. gallisepticum  -           0.27(b)     0.26        0.19        0.22        0.17        0.14        0.15        0.12        0.09        0.07        0.09
M. capricolum     79.5(c)     -           0.47        0.28        0.35        0.30        0.25        0.27        0.20        0.23        0.12        0.17
S. citri          80.5        87.7        -           0.25        0.30        0.25        0.23        0.20        0.19        0.22        0.12        0.14
Ac. laidlawii     77.4        81.2        79.9        -           0.39        0.28        0.24        0.21        0.17        0.22        0.12        0.15
C. innocuum       76.6        80.8        81.0        82.0        -           0.33        0.27        0.23        0.24        0.22        0.17        0.14
B. subtilis       75.0        81.1        79.3        78.6        82.1        -           0.41        0.36        0.27        0.26        0.24        0.24
H. chlorum        74.5        78.3        78.5        77.9        79.0        83.5        -           0.32        0.24        0.26        0.27        0.28
Ar. globiformis   73.7        78.1        77.2        77.3        79.5        81.0        80.3        -           0.24        0.25        0.17        0.24
An. nidulans      73.2        77.0        77.6        76.7        78.4        80.5        81.0        78.6        -           0.23        0.26        0.24
A. tumefaciens    74.1        77.2        76.5        76.6        77.1        79.1        78.9        77.6        77.7        -           0.25        0.28
D. desulfuricans  72.2        75.0        75.4        75.0        76.8        79.6        80.3        78.7        78.9        80.9        -           0.33
E. coli           71.6        75.2        73.9        73.5        75.8        77.5        78.4        77.0        78.0        79.4        79.1        -

(a) References 1, 19, 68, 93, 162, 163, 224, 255, 274; sequences for all members of the mycoplasma group except Mycoplasma capricolum are unpublished, as is
that for Ar. globiformis. M., Mycoplasma; S., Spiroplasma; Ac., Acholeplasma; C., Clostridium; B., Bacillus; H., Heliobacterium; Ar., Arthrobacter; An.,
Anacystis; A., Agrobacterium; D., Desulfovibrio; E., Escherichia.

(b) Upper right are S_AB values (55).

(c) Lower left are percent similarities; all positions not represented in all sequences have been eliminated from consideration.



Dynamics of Variation in rRNA 

We assume that (naturally occurring) changes in rRNA 
sequence are "selectively neutral." The assumption is jus- 
tified on the grounds that translation has to have been one of 
the earliest functions established in the cell (for the evolution 
of all of the cell's proteins depends upon its existence) and, 
once perfected, there should be no reason, no selective 
constraints, to change it (outside of rare minor adjustments 
made to accommodate changes in the cell's basic physical 
parameters, such as optimum growth temperature, intracel- 
lular pH, or ionic concentrations). To support this case, one 
can invoke the facts that rRNA secondary structures show 
little variation within any of the primary kingdoms (75, 260), 
implying corresponding functional constancy, and that 
within a kingdom components can usually be exchanged 
among different translation systems without destroying func- 
tion (2, 14, 156). The mycoplasmas are not exceptional in 
this regard. The physical parameters of their niches are 
normal; they do not appear to synthesize unusual types of 
proteins; their rRNAs have normal secondary structures 
(93); and some of their close relatives, normal eubacteria in 
all known respects (e.g., Clostridium innocuum, Lactobacil- 
lus cateneforme) share their rRNA sequence idiosyncrasies 
to some extent (265). 

To say that (naturally occurring) changes in rRNA se- 
quence are by and large selectively neutral, does not mean 
that individual base changes are necessarily so; it is the 
overall (composite) change that tends to be independent of 
selection. Changing a single position in a base pair, for 
example, generally creates a mispair, which would probably 
be selected against; to change both members of the pair in a 
way that maintains normal pairing, however, might have 
negligible selective impact. (More complicated arrange- 
ments, tertiary structural and the like, can also be imagined 
wherein three or more positions would have to be changed to 
retain proper function.) The individual mutations creating 
such a composite change, therefore, have to occur simulta- 
neously or at least in fairly rapid succession. 

The most important attributes of rRNA sequence variation 
for our purposes are (i) that recognizable patterns of varia- 
tion exist, i.e., that the same variations tend to occur in 
different phylogenetic groups; and (ii) that the frequency 
with which changes in composition happen can vary widely 
from position to position within the sequence (262, 265). 



Regarding the first point, the pattern of variation at a given 
position is often the same for most or all eubacterial phyla 
and sometimes even for archaebacteria (266). Regarding the 
second, the rates at which the most and least variable 
positions change differ by at least two orders of magnitude 
(265). Both the frequency and pattern of variation seem to 
correlate strongly with the overall structure of the rRNA 
molecule and so would seem to be functionally determined. 
Base pairs at some positions in some helices change fre- 
quently; other pairs in these same helices change rarely if at 
all; some helices are far more variable in composition than 



TABLE 19. Number of positions in which a given sequence
shows exception to consensus(a)

                         Consensus sequence
Species                  1(b)    2(c)    3(d)    4(e)

Mycoplasmas
1. M. gallisepticum      66      50      44      99
2. M. capricolum         41      13      24      64
3. S. citri              46      18      23      60
4. Ac. laidlawii         60      20      33      65
5. C. innocuum           23      13      12      39

Normal eubacteria
6. B. subtilis           6       9       2       22
7. H. chlorum            13      18      3       24
8. Ar. globiformis       27      31      10      41
9. An. nidulans          28
10. A. tumefaciens       23
11. D. desulfuricans     17
12. E. coli              33

(a) References 19, 68, 93, 163, 224, 255, 274; sequences for all members of the
mycoplasma group except M. capricolum are unpublished, as is that for Ar.
globiformis. For genera, see Table 18.

(b) Consensus allowing one exception only at a given position; the values for
the species in the mycoplasma group, i.e., no. 1 to 5, are calculated
individually from alignments containing that sequence alone (from the
mycoplasma group) and no. 6 to 12; the values shown for no. 6 to 12 are therefore
averages over the five resulting consensus sequences.

(c) Consensus allowing one exception; alignment contains only gram-positive
species, i.e., no. 1 to 8.

(d) Consensus allowing no exceptions; alignment contains species 9 to 12 plus
Leptonema illini (unpublished data), Chlorobium vibrioforme (Weisburg,
Ph.D. thesis), Thermomicrobium roseum (162) and Thermotoga maritima (1);
i.e., it contains no gram-positive species.

(e) Same as footnote d except that one exception is allowed in generating the
consensus.
others; and the composition of loops tends to be more highly 
conserved than that of the underlying double-stranded stalks 
(75, 260, 262, 265). 
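
The counts reported in Table 19 can be understood through a short sketch of one way such a tally might be made. It is an illustration under stated assumptions, not the procedure actually used for the table: the sequences are taken to be pre-aligned strings of equal length, a position has a consensus when no more than a chosen number of reference sequences depart from the most common base, and the function names and toy sequences are hypothetical. The reference set may or may not include the query sequence, as in the table's footnotes.

```python
from collections import Counter
from typing import Optional


def consensus(seqs: list[str], max_exceptions: int = 0) -> list[Optional[str]]:
    """Consensus base for each column, or None where no consensus exists.

    A column has a consensus if at most `max_exceptions` sequences differ
    from its most common base. This is only meaningful when the alignment
    holds more than 2 * max_exceptions sequences.
    """
    if not seqs or len({len(s) for s in seqs}) != 1:
        raise ValueError("need a non-empty alignment of equal-length sequences")
    result: list[Optional[str]] = []
    for column in zip(*seqs):
        base, freq = Counter(column).most_common(1)[0]
        result.append(base if len(column) - freq <= max_exceptions else None)
    return result


def count_exceptions(query: str, cons: list[Optional[str]]) -> int:
    """Number of consensus positions at which the query departs from consensus."""
    if len(query) != len(cons):
        raise ValueError("query and consensus must have the same length")
    return sum(1 for q, c in zip(query, cons) if c is not None and q != c)


# Toy alignment (hypothetical): three identical references and one variant.
refs = ["ACGUAC", "ACGUAC", "ACGUAC", "ACCUAG"]
query = "AUGUAG"
print(count_exceptions(query, consensus(refs, max_exceptions=1)))
print(count_exceptions(query, consensus(refs, max_exceptions=0)))
```

Allowing one exception enlarges the set of positions that count as conserved, so a sequence can only show as many or more exceptions than under the strict consensus. The same relation holds between columns 3 and 4 of Table 19.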

Why mycoplasma rRNAs are so unusual. Not only do 
mycoplasmas tend to vary the otherwise conserved positions 
(i.e., introduce the rare composite changes) in rRNAs more 
readily than do their normal counterparts, but they even 
differ significantly from one another in this respect. The 
tendency to vary conserved positions is far more accentu- 
ated in Mycoplasma gallisepticum, for example, than in 
other mycoplasmas (262, 265) (see also Table 20). Moreover, 
one line of mycoplasmas will make changes in the rRNA 
sequence that others do not, as though a broad range of 
possibilities existed from which to choose (262, 265). If their 
ribosomes are structurally and functionally normal, then 
functional constraints on the ribosome cannot bring about 
the evolution of these rRNA idiosyncrasies. One has to 
consider, therefore, that these idiosyncrasies are not con- 
nected to ribosome function/evolution per se; rather, they 
reflect some general peculiarity of the evolutionary process 
in mycoplasmas. 

Consider the following argument: if changing one of the 
more conserved positions in an rRNA has to involve (nearly) 
simultaneous changes elsewhere in the same (or another) 
molecule, then the occurrence of the overall, composite 
change is a higher-order function of the organism's (lin- 
eage's) mutation rate. For low enough rates such changes 
occur with a negligible frequency relative to simple (first- 
order) nucleotide changes, but as the mutation rate in- 
creases, the composite changes will increase in frequency 
relative to the simple ones (265). This means that lineages 
with low mutation rates have associated with them fields of 
rRNA variants (from which they derive their evolutionary 
variability) that are relatively restricted, whereas lines with 
higher mutation rates draw upon much richer (more varied) 
fields of variants. Other factors being equal, rRNAs would 
evolve far more variety in the latter case than in the former. 
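
The argument can be made concrete with a toy calculation. It assumes, as a simplification not stated in the text, that mutations at different positions arise independently at a per-position rate mu, so that a composite change needing k nearly simultaneous mutations arises at a rate proportional to mu**k. The function name and the rates used are illustrative only.

```python
def composite_to_simple_ratio(mu: float, k: int = 2) -> float:
    """Frequency of k-th order composite changes relative to simple ones.

    Toy model (an assumption): simple changes scale as mu and composite
    changes as mu**k, so their ratio scales as mu**(k - 1).
    """
    if mu <= 0 or k < 1:
        raise ValueError("mu must be positive and k at least 1")
    return mu ** (k - 1)


# Hypothetical per-position rates, a factor of ten apart.
low_rate, high_rate = 1e-9, 1e-8

# How much more frequent composite changes become, relative to simple
# ones, when the mutation rate rises tenfold.
gain_pairs = composite_to_simple_ratio(high_rate, 2) / composite_to_simple_ratio(low_rate, 2)
gain_triples = composite_to_simple_ratio(high_rate, 3) / composite_to_simple_ratio(low_rate, 3)
print(gain_pairs, gain_triples)
```

Under these assumptions a tenfold rise in mutation rate makes paired changes roughly ten times, and triple changes roughly a hundred times, more frequent relative to simple changes. This is the sense in which the higher rate draws on a richer field of variants.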

Although mycoplasma mutation rates have not actually 
been measured, there are good reasons to suspect them to be 
abnormally high. In any line of descent mutation rate must 
be optimized. It cannot be so high that deleterious mutations 
are created in a significant fraction of an organism's progeny, 
yet it cannot be so low that the lineage is unable to adapt to 
fluctuations in its environment or is otherwise unable to 
compete effectively (265). An upper bound to mutation rate 
is, therefore, set by an organism's functional genome size.
The larger that genome, the harder to replicate it without
introducing errors. Consequently, an organism with a small
genome could be as stable evolutionarily as organisms with
larger genomes even though it had a higher mutation rate 
(per base pair). Mycoplasma genomes are four to eight times 
smaller than the eubacterial norm (137). Although 
mycoplasmas arose from ancestors (Clostridia) having nor- 
mal genomes, and so presumably normal mutation rates, the 
constraints keeping mutation rates low cease to exist once 
genome size decreases. One would also expect some 
mycoplasmas to have elevated mutation rates because they 
are known to be deficient in certain DNA repair capacities 
(61) and to lack a DNA polymerase 3'->5' exonuclease 
activity (150). 
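
The genome-size argument reduces to simple arithmetic under one simplification introduced here, not a measured result: that what selection limits is the expected number of mutations per genome per replication, i.e., the per-base-pair rate times the functional genome size. The genome size and the per-genome budget below are placeholders chosen only for illustration.

```python
def per_bp_rate_ceiling(genome_size_bp: float, per_genome_budget: float) -> float:
    """Highest per-base-pair rate that keeps mutations per genome within budget."""
    if genome_size_bp <= 0 or per_genome_budget <= 0:
        raise ValueError("genome size and budget must be positive")
    return per_genome_budget / genome_size_bp


def relative_ceiling(size_reduction: float,
                     normal_genome_bp: float = 4_000_000,   # placeholder value
                     per_genome_budget: float = 0.003) -> float:  # placeholder value
    """Ceiling for a genome `size_reduction` times smaller, relative to normal."""
    normal = per_bp_rate_ceiling(normal_genome_bp, per_genome_budget)
    small = per_bp_rate_ceiling(normal_genome_bp / size_reduction, per_genome_budget)
    return small / normal


# Mycoplasma genomes are four to eight times smaller than the norm (137).
for factor in (4, 8):
    print(factor, relative_ceiling(factor))
```

Whatever placeholder values are chosen, the ratio depends only on the size reduction: a genome four to eight times smaller could carry a per-base-pair rate four to eight times higher at the same per-genome load.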

Summary. In brief, my argument is this: changes in rRNA 
sequence are for the most part selectively neutral. However, 
many of these changes are composite and so would appear to 
involve nearly simultaneous, coordinate alterations in two or 
more positions in the molecule. Composite change of this 
type is a higher-order function of a lineage's mutation rate. 



Those lines having elevated mutation rates would be ex- 
pected not only to show the normal types of rRNA variants 
at higher than normal levels, but also to spawn variants that 
normally occur at inappreciable levels. Because their 
genomes are small, mycoplasmas can develop elevated 
mutation rates, generating rRNA variants not usually asso- 
ciated with normal lineages, which gives a unique richness 
(variability) to the evolution of their rRNAs. 

Both Tempo and Mode of Bacterial Evolution Are 
Reflected in rRNA 

A useful, if idealized, model for the rRNA chronometer is 
a measuring device that comprises a series of counters. The 
basic, primary counter records the number of (certain) 
events occurring; it simply measures a distance, a rate ×
time. The others are threshold counters (differing from one
another in having progressively higher thresholds). Each one
registers nothing until the rate at which the events occur
reaches its particular threshold value, after which it too
measures rate × time. Such a chronometer can measure
more than long-term average rates, more than simple distances.
It can detect changes in rate and peak rates. rRNA is
not the simple uniform-rate chronometer our analyses generally
assume. It behaves as a "compound" chronometer in
the above sense. As such it should be able to detect whether 
(but not necessarily when) episodes of rapid evolution have 
occurred in a lineage's history, which opens the tempo-mode 
problem to study on the molecular level. Such a rate- 
sensitive chronometer could also be used to produce 
phylogenetic trees in which the root is internally delimited 
(265). 
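
The idealized device described above can be sketched in a few lines. The sketch assumes a lineage's history is given as a list of intervals, each with a constant rate and a duration; the function name, the threshold values, and the two example histories are hypothetical and serve only to show how the counters behave.

```python
def compound_chronometer(history, thresholds):
    """Read out a primary counter and a series of threshold counters.

    history:    list of (rate, duration) intervals.
    thresholds: rates at which successive threshold counters start to register.
    Returns (primary, gated), where primary is the total rate x time and
    gated[i] is the rate x time accumulated while rate >= thresholds[i].
    """
    primary = sum(rate * dt for rate, dt in history)
    gated = [sum(rate * dt for rate, dt in history if rate >= t)
             for t in thresholds]
    return primary, gated


THRESHOLDS = [2.0, 5.0]                  # hypothetical threshold rates

steady = [(1.0, 100.0)]                  # uniform rate throughout
episodic = [(0.5, 80.0), (3.0, 20.0)]    # slow, with one rapid episode

print(compound_chronometer(steady, THRESHOLDS))
print(compound_chronometer(episodic, THRESHOLDS))
```

Both histories register the same distance on the primary counter, but only the episodic one registers on the first threshold counter. The readout shows that a rapid episode occurred and how severe it was, though not when it happened.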

The unusual changes encountered in the rRNAs of 
mycoplasmas and other rapidly evolving bacterial lineages 
actually measure the mode of evolution. In a sense they are 
the mode, for at the molecular level tempo and mode come 
together; they are different facets of the same process. The 
mutational changes that are summed to indicate a tempo 
(evolutionary distance) include the changes that define the 
quality of the field of variants. The idiosyncratic selectively 
neutral variants of mycoplasma rRNAs are obviously repre- 
sentative of all variants in the mycoplasma phenotype. 
Expanding the field of variants, as mycoplasmas appear to 
do, makes it statistically unavoidable that their evolution be 
both more rapid than normal and highly atypical. 

Macroevolution in mycoplasmas is chronic. As long as 
mycoplasma genomes remain small, these organisms would 
seem to be in a chronic state of rapid evolution. Classical 
macroevolution is episodic, however. Thus, a question re- 
mains as to whether bacteria can exhibit episodic rapid 
evolution, whether macroevolution in its classical form 
occurs in the microbial world. 

The rRNA chronometer in principle should reveal the 
episodic form of rapid evolution in the same way it does the 
chronic form, although the former would be harder to detect 
in that it would leave less of a trace. That rRNA sequence 
signatures can be constructed for the various bacterial 
groups implies that unusual sequence changes become fixed 
during the formation of major taxa. For a group that contains 
many characterized representatives, the cumulative evolu- 
tionary distance within it (i.e., the sum of all branch lengths 
on the corresponding phylogenetic tree) is very large com- 
pared to the evolutionary distance that separates that 
group's ancestor from the ancestor of some other nearby 
group. Yet (for these well-characterized groups) the unusual 
types of change that become fixed during their formation 
tend not to occur during the group's evolutionary ramifications.
No statistical justification of this point will be attempted
at this time, for more rRNA sequence data than now
exist are required to make a strong case. However, the 
indications are there; major bacterial groups come into being 
through episodic rapid evolution. 

Conditions for true macroevolution among bacteria. Epi- 
sodic rapid evolution among bacteria would (according to 
the above reasoning) require that conditions exist under 
which bacterial mutation rates can increase but subsequently 
return to normal. Such conditions may well be those invoked 
by the classical evolutionist to explain macroevolutionary 
change (192). (However, the classical conditions now be- 
come necessary but insufficient to effect episodic rapid 
evolution in bacteria.) Environments are of two general 
types: those to which organisms can become well adapted, 
and those to which they cannot. Stable or cyclically varying 
environments can be of the first type. Chaotically varying 
and "extreme" environments in general are of the second. 
In the first case an organism's phenotype can (and does) 
become "fine tuned" to the environment. Organism and 
niche come into some sort of close and detailed correspon- 
dence: nuances of phenotype have selective meaning; addi- 
tional levels of organization (control) are added; efficiency of 
function increases; new refinements, details, evolve. Such 
are the general evolutionary considerations for normal envi- 
ronments. Contrast this to evolution in an unpredictably 
fluctuating or otherwise extreme environment. Environ- 
ments of this type stress the organism's physiological re- 
sponses to the limit. Existence here is a matter of survival 
under any condition, not of survival of the fittest. In other 
words, fine-tuning, efficiency, etc., have little to do with 
evolution in this context. Under these conditions negative 
selection would, in some senses and some areas, be relaxed. 
Many genes concerned with fine-tuning would be of little 
significance. Only the most basic genes, those that make for 
integrity of the organism and continuity of the lineage, would 
really count; and the precision or efficiency (i.e., fine-tuning) 
of their function may not be as selectively significant a 
concern as it normally is. Under conditions such as these, 
when many functions become superfluous, the effective 
genome size is reduced, and some selection is relaxed, a 
lineage might then be able to sustain an elevated mutation 
rate. If so, the resultant expansion of the field of variants 
would mean that unusual phenotypic features necessarily 
arise in the lineage: some of the rare variations might even be 
essential to the line's survival in the untoward environment 
(and so an increased mutation rate would have positive 
selective value). Were such a line subsequently to adapt to a 
more stable, compatible environment (or even somehow 
stabilize in the formerly extreme one), its mutation rate 
would necessarily return to normal, but the organism would 
bear the scars of its tumultuous history; its phenotype would 
be drastically changed and highly unique. 

Basic Principles of Bacterial Evolution 

The above conceptualization of the tempo-mode relation- 
ship makes bacterial evolution appear straightforward and 
understandable. Normal lineages, those having normal mu- 
tation rates, do not drastically change their ancestral pheno- 
type. If the environment in which a phenotype arises (first 
stabilizes) persists, that phenotype will persist, fundamen- 
tally unchanged. This is not to say that the original pheno- 
type cannot change to fit a novel environment (without 
increase in mutation rate and so on), but it is to say that the 



kinds of new environments to which it is capable of adapting 
in this way do not radically alter the ancestral phenotype. On 
the other hand, all drastic (broad-ranging) changes in ances- 
tral phenotype necessarily result from increased mutation 
rates, which tend to occur under unusual, drastic environ- 
mental conditions, when selection is relaxed in ways that 
allow the mutation rate to rise. 

In a general sense, then, the course of bacterial evolution 
is relatively simple to chart by means of a macromolecular 
chronometer, a purely genotypic measure. Lineages repre- 
sented by the shorter branches on a phylogenetic tree (i.e., 
the ones least distant from the tree's root) are slowly 
evolving and retain proportionately more of the ancestral 
phenotype. Lineages represented by the longer branches, on 
the other hand, are rapidly evolving and necessarily retain 
far fewer ancestral characteristics. Moreover, a molecular 
chronometer will show structural idiosyncrasies in the latter 
case, the "scars" of the lineage's bout of rapid evolution. 

Implications for Bacterial Taxonomy 

If future findings support the above conclusions and 
speculations, then it should be possible to construct a 
bacterial taxonomy based upon naturally defined categories. 
Groups that arise through macroevolution are self-defining; 
they are recognizably unique, distinct both phenotypically 
and genotypically (in terms of molecular chronometers). 
Many higher bacterial taxa, i.e., phyla and their major 
subdivisions, show this characteristic. 

In addition to naturally demarcated categories, two other 
requirements for a natural taxonomy are: (i) a means of 
determining relationships among the categories, which, of 
course, is given by the topology of a phylogenetic tree; and 
(ii) a means of naturally defining taxonomic rank, one that is 
not completely dependent upon tree topology. There is no 
reason a priori that a taxon of higher rank, a class, for 
example, cannot be included in one of lower rank, e.g., a 
family. Indeed, the mycoplasmas may be a case in point. The 
question is whether the microbiologist can accept such 
natural "inversions" of rank as taxonomically valid or 
whether he will insist upon an arbitrarily defined taxonomy 
that does not contain them and so appears orderly to him. 

Macroevolutionary episodes can probably be classified by 
the degree of their severity. If the rRNA chronometer in 
essence consists of a series of counters having progressively 
higher thresholds (i.e., the model used above), then the 
highest-threshold counter activated by a macroevolutionary 
episode defines the severity of the episode, which is a de 
facto definition of taxonomic rank. Without further con- 
straints, however, we are again left with the possibility of a 
jumbled taxonomic hierarchy, for the timing and severity of 
macroevolutionary episodes would seem to be unrelated in 
unrelated lineages. 

It is obvious that metazoan taxonomy exhibits more order 
than is inherent in the natural system just described. The 
animal phyla seem all to have arisen at about the same time, 
somewhat less than 1 billion years ago; the eucaryotic 
kingdoms arose similarly at an earlier stage (196). Evolution 
tends to follow two rules: (i) the higher its taxonomic rank, 
the further back in time the taxon arises; and (ii) major taxa 
of the same rank tend to arise in the same era, giving rise to 
evolutionary radiations. All this may seem self-evident to 
some, but it is not, particularly in the bacterial world where 
potential ancestral phenotypes seem to persist for all taxa up 
to the kingdom level. The explanation for this unaccountable 
time ordering of the actual taxonomic hierarchy may lie 
outside of biology per se, in the evolutionary history of the 
planet. Geologists are coming to the conclusion that the 
history of earth and the other planets, moons, etc., is written 
in terms of catastrophes, relatively sudden, chaotic global 
changes (3). The frequency and severity of such changes 
increase as one looks backward in time. (Unfortunately, the 
record of their happening erodes with time, particularly on 
earth.) The intense bombardment by meteors and similar 
objects, which changed the face of the moon (and presum- 
ably affected earth as well) prior to 3.9 billion years ago, is 
one example (85). Another example, more germane to biol- 
ogy, might be the relatively sudden (on an evolutionary time 
scale, that is) rise in atmospheric oxygen concentration. 
Such precipitate shifts in global parameters would be capa- 
ble of triggering synchronous macroevolutionary episodes in 
various lineages (3). Because the more severe "catastro- 
phes" tend to be the earlier ones, the taxa of higher rank 
would tend to form earlier than those of lower rank. 

The genealogical history of bacteria beginning to be re- 
vealed by molecular chronometers will provide ample data 
to develop or discard such a view of evolutionary relation- 
ships and their taxonomic implications. 

EVOLUTION OF THE TWO PROCARYOTIC 
PHENOTYPES 

The enormous value of bacterial phylogeny as a classifi- 
cation system, a predictive and organizational framework, is 
easier to appreciate than its value as an historical account 
and source of evolutionary insights. This is because our 
evolutionary perspective has to this point in time been 
focused narrowly on metazoa and their fossils. This point of 
view has inevitably overemphasized morphology, concen- 
trating on subtle but superficial differences among complex 
forms. At the same time it has underemphasized the "met- 
abolic" aspect of evolution; it embodies little feeling for 
biochemistry and energy flow. Since it has been confined to 
a relatively recent (and so, in the grand scope, uninteresting) 
period in earth history, well after formation of the oxygen 
atmosphere (180, 228), our present view incorporates in a 
minor way only the dynamism of the evolving earth and the 
close relationship between its physical and biological evolu- 
tion. Overall, this view is a static one, in which evolution is 
not a process but rather "a procession of forms," as 
Whitehead put it (237). 

In contrast, microbial evolution is essentially metabolic, 
fundamentally biochemical. It spans the bulk of our planet's 
history and is intimately tied thereto. As the base of the food 
chains, microbial metabolic patterns bear a straightforward 
and (ultimately) understandable relationship to the planet's 
geochemistry. Thus, bacterial evolution is no simple extrap- 
olation of metazoan evolution — more of the same and, 
lacking fossils, harder to study. Microbial evolution is a 
different story, told in a different way, covering a different 
(more extensive) period of earth history, a story that is 
simpler, more readily interpretable, and more informative of 
the planet's physical course. 

Archaebacterial Evolution 

Ancestral phenotype. It would appear that the ancestral 
archaebacterium was an extremely thermophilic anaerobe 
that probably derived its energy from the reduction of sulfur. 
Two lines of evidence support such a conclusion. The first is 
the widespread distribution of the extreme thermophilic 
phenotype among the archaebacteria. Of the three basic
archaebacterial phenotypes, it is, as we have seen, the only
one that occurs on both major branches of the archaebacte- 
rial tree. Thermococcus celer, on the methanogen branch, 
and Pyrodictium occultum, on the extreme thermophile
branch, are both typical sulfur-metabolizing thermophiles, 
thriving anaerobically in hot spring environments. (Pyrodic- 
tium holds the current record for highest optimum growth 
temperature of any organism, 105°C [212], while Thermo- 
coccus is a common inhabitant of marine hot springs [280].) 
The second is that the extreme thermophile phenotype is the 
only one to meet the tempo-mode criterion for being ances- 
tral; the evolutionary distance between Thermococcus and 
Pyrodictium is remarkably short, about 18%, appreciably 
shorter than the shortest distances, 24 to 25%, that separate 
any methanogen (relatives of T. celer) from P. occultum or 
its relatives, as can be seen in Table 17. From these facts, 
and the fact that the two extreme thermophiles are about 
equally distant from various eubacterial or eucaryotic 
outgroup sequences (263), it follows that both lineages are 
slowly evolving and, therefore, have retained more common 
ancestral characteristics than have the other archaebacterial 
phenotypes. 
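
The two-part argument above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the 18% and 24 to 25% figures come from the text, but the labels "methanogen A" and "methanogen B", the outgroup distances passed in at the end, and the tolerance are hypothetical placeholders, not values taken from Table 17.

```python
# Cross-branch 16S rRNA sequence distances, in percent.
# 18.0 and the 24-25 range are quoted in the text; the two methanogen
# labels are hypothetical stand-ins for "any methanogen".
CROSS_BRANCH = {
    ("Thermococcus", "Pyrodictium"): 18.0,
    ("methanogen A", "Pyrodictium"): 24.0,
    ("methanogen B", "Pyrodictium"): 25.0,
}


def closest_cross_branch_pair(distances):
    """Return the pair of taxa separated by the smallest distance."""
    return min(distances, key=distances.get)


def similar_outgroup_distance(d_first, d_second, tolerance=1.0):
    """True when two lineages are about equally far from one outgroup.

    The tolerance (in percentage points) is an arbitrary assumption.
    """
    return abs(d_first - d_second) <= tolerance


pair = closest_cross_branch_pair(CROSS_BRANCH)
print(pair, CROSS_BRANCH[pair])

# Hypothetical outgroup distances for the two extreme thermophiles.
print(similar_outgroup_distance(33.0, 33.5))
```

A short cross-branch distance alone could mean that one lineage is slow and the other fast; the outgroup comparison is what suggests that both are slowly evolving.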

Two genera, Sulfolobus and Thermoplasma, are atypical
sulfur-dependent thermophiles in having evolved the capac- 
ity to utilize oxygen (215), almost certainly a derived char- 
acteristic. It is interesting, therefore, that both represent 
relatively rapidly evolving lineages. The lineage of 
Sulfolobus is the most rapidly evolving on its branch of the
archaebacterial tree, while that of Thermoplasma is perhaps 
the most rapidly evolving of all archaebacterial lineages 
(263). 

Evolution of methanogenic and halophilic phenotypes. 
Since these two phenotypes are significantly further from the 
extreme thermophile cluster (by rRNA measure) than is their 
relative Thermococcus, the methanogenic and halophilic 
lineages would seem to have undergone macroevolution.
Presumably such an episode was associated with the transi- 
tion from an ancestral thermophilic sulfur-metabolizing phe- 
notype to a methanogenic one. Given such an evolutionary 
progression, one has the interesting question of how 
thermophilic sulfur metabolism can change into methano- 
genesis and what global conditions might have favored, i.e., 
brought about such a transition. (Very recently K. O. Stetter 
[personal communication] has isolated what may be a miss- 
ing link in such a transition. The organism is a novel 
archaebacterial phenotype; it grows anaerobically and re-
duces sulfate. It contains several of the cofactors character-
istic of the methanogens, but lacks the all-important factor
coenzyme M, which is involved in the terminal step of 
methane production [238]. Nevertheless, the organism does 
produce methane in minute amounts, as does the eubacterial 
sulfate reducer Desulfovibrio desulfuricans [173]. A prelim-
inary and unpublished partial 16S rRNA sequence shows its
lineage to arise from the methanogen branch of the 
archaebacteria between the Thermococcus and Methano- 
coccus lineages in Fig. 13.) 

Within the methanogens per se a second round of rapid 
evolution seems to have occurred, involving the branch that 
leads to the Methanomicrobiales. Note in Table 17 that
sequence distances for members of the Methanomicrobiales 
are greater than corresponding distances for both of the 
other methanogen phyla. This can be seen in convincing 
detail in the analysis of Table 20, which gives sequence 
distances of various archaebacterial species from various 
consensus sequences. 
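
The idea of measuring each sequence against a consensus can be sketched in Python as follows. This is a minimal illustration, not the procedure used to produce Table 20: the majority-rule consensus, the gap symbol, the tie handling, and the toy sequences are all assumptions, and gaps are skipped pair by pair rather than across the whole alignment.

```python
from collections import Counter

GAP = "-"  # assumed symbol for an alignment gap


def consensus(aligned):
    """Majority-rule consensus of equal-length aligned sequences.

    Ties go to the residue seen first in the column (an arbitrary choice).
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned):
        residues = [c for c in column if c != GAP]
        if not residues:
            result.append(GAP)
        else:
            result.append(Counter(residues).most_common(1)[0][0])
    return "".join(result)


def percent_distance(seq, ref):
    """Percent mismatches over positions that are non-gap in both."""
    compared = 0
    mismatches = 0
    for a, b in zip(seq, ref):
        if a == GAP or b == GAP:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Toy aligned fragments, invented for illustration only.
toy = ["ACGUACGU", "ACGUACGA", "ACGAACGU"]
cons = consensus(toy)
print(cons)
print(percent_distance("ACGAACGU", cons))
```

A lineage that has changed quickly since the common ancestor should sit farther from a consensus built from its relatives than a slowly evolving lineage does, which is the pattern the text points to for the Methanomicrobiales.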

TABLE 20. Sequence distances (in percentages) of
archaebacterial rRNAs from various consensus sequences^a

                              Consensus sequence
Sequence from:            1^b      2^c      3^d      4^e

Mc. vannielii              -        -      10.0     16.0
M. formicicum              -        -       9.6     15.4
Ms. hungatei               -      10.0     13.9     19.3
Mr. sp. strain WH-1       6.5     10.5     13.0     18.6
Extreme halophiles^g      8.9     11.6     15.6     20.7
T. acidophilum           13.1     13.0     15.7     21.6
Tc. celer                 5.1       -        -      10.4
S. solfataricus          12.3     12.1      5.5      6.7
Tp. tenax                11.5     10.4       -        -
D. mobilis                8.9      8.3       -        -
P. occultum               9.3      8.3      1.1       -

a Only those positions represented in all positions in 16S rRNA alignment
are used in calculation. References 72, 91, 96, 124-126, 158, and 275.
Sequences for Methanosarcina sp. strain WH-1, T. acidophilum, Tc. celer, D.
mobilis, and P. occultum are unpublished. Mc., Methanococcus; M., Methan-
obacterium; Ms., Methanospirillum; Mr., Methanosarcina; T., Thermo-
plasma; Tc., Thermococcus; S., Sulfolobus; Tp., Thermoproteus; D., Desul-
furococcus; P., Pyrodictium.
b Consensus based upon Mc. vannielii, M. formicicum, and Ms. hungatei.
c Consensus based upon Mc. vannielii, M. formicicum, and Tc. celer.
d Consensus based upon Tc. celer, Tp. tenax, and D. mobilis.
e Consensus based upon Tp. tenax, D. mobilis, and P. occultum.
f -, Sequence used to generate consensus.
g Values are averages of the three published halophile sequences and are
within 7% of one another.

Although the most spectacular change during this second
saltation was the conversion of an anaerobic methanogen
into an aerobic extreme halophile, profound changes also
affected the methanogens themselves in this particular lin-
eage. Some of the Methanomicrobiales became halophilic,
some even alkaliphilic (13; Mathrani et al., Abstr. Annu.
Meet. Am. Soc. Microbiol. 1985). Methanosarcina and its
relatives learned to produce methane from acetate or methyl
amines; they are, as noted, the only methanogens possessing
cytochrome b or c or both (112). The extreme halophiles,
which require small amounts of oxygen to synthesize their
carotenoids (79), may represent an evolutionary response to
the onset of aerobic conditions in the hydrosphere, about 1.5
billion years ago (228).

Archaebacterial ribosomes and their evolutionary implica-
tions. The archaebacteria are unique among the three
urkingdoms in that variation in their ribosome type occurs.
Cammarano and co-workers (21) have measured the protein
content of ribosomal subunits by buoyant density centrifu-
gation and find that the molecular weight of protein associ-
ated with the small subunit in the extreme thermophiles is
0.64 x 10^6 to 0.66 x 10^6 but in the extreme halophiles and
two of the methanogen phyla it is only half this, i.e., 0.31 x
10^6 to 0.32 x 10^6. In the remaining methanogen group, the
Methanococcales, protein contents of the small subunit have
an intermediate value, 0.52 x 10^6 daltons. A similar situation
obtains for the large subunit; there the protein contents in
the extreme thermophiles and Methanococcus are in the
range of 0.77 x 10^6 to 0.97 x 10^6 daltons, but in the
remaining methanogens and the extreme halophiles they are
much less, i.e., 0.51 x 10^6 to 0.57 x 10^6 daltons (21). (The
ribosomes of Thermoplasma are also high in protein; the
small and large subunits contain 0.61 x 10^6 and 0.78 x 10^6
daltons of protein, respectively [21].) Since rRNA sizes are
very nearly the same for all archaebacteria, this approximate
twofold variation in protein content must make some change
in the size of the ribosomal subunits, and electron micro-
graphs bear this out. The large ribosomal subunit from the
extreme thermophiles is larger than those seen in methano-
gens; it has several protrusions that its methanogenic (and
halophilic) counterparts do not (81). However, some of the 
large ribosomal subunits from Methanococcus vannielii have 
these protrusions as well, while the remainder are typical of 
the other methanogens (and extreme halophiles) (218, 219). 

The molecular basis for these large differences in ribo- 
somal protein content is not understood. There is no indica- 
tion that they are connected to structural differences in the 
rRNAs (Woese, unpublished analysis). The excess proteins 
are probably not, therefore, attached directly to rRNA. It 
also seems unlikely that the drastic disparities in protein 
content reflect any significant difference in ribosome func- 
tion in the two classes. 

When these differences in ribosome type were first discov- 
ered (as shape differences in electron micrographs), they 
were interpreted to mean that the archaebacteria were not a 
valid taxon; the extreme thermophiles and the methanogens 
(and their relatives) constituted separate urkingdoms, the 
former having "a close relationship to eukaryotes" (115). 
(For similar reasons, the extreme halophiles were subse- 
quently extracted from the methanogen branch to constitute 
another new kingdom, the "photocytes," which ostensibly 
was specifically related to eubacteria [114].) Now that ribo- 
some morphologies have been more thoroughly investigated, 
it is apparent that ribosome shapes (protein contents) con- 
stitute more a spectrum of types than two clear-cut classes 
(114, 115, 218, 219). In any case, were separate kingdoms to 
be defined along these lines, one group of methanogens, the 
Methanococcales, would end up in a different kingdom (218, 
219) than the others — which is absurd! As taxonomists well 
know and have repeatedly stated, a small number of (ill- 
defined) characters is an unreliable basis upon which to 
define taxa. 

Summary. Archaebacterial evolution can be simply under- 
stood in terms of an aboriginal anaerobic thermophilic 
sulfur-metabolizing phenotype that remained pure in one of 
the urkingdom's two main lineages, but gave rise in the 
other, through several macroevolution episodes, initially to
methanogenic metabolism and then to an altered, more 
versatile (acetate- or methyl amine-utilizing) form of methano- 
genesis, to halophilic methanogens, and ultimately to the 
aerobic (nonmethanogenic) extreme halophiles. 

Eubacterial Evolution 

Eubacterial history seems less straight-forward than archae- 
bacterial history. The most prominent phenotypic character- 
istics of the eubacterial tree are its metabolic diversity and 
the widespread distribution of anaerobic, photosynthetic,
and thermophilic phenotypes. Aerobic and anaerobic group- 
ings stand in sharp contrast to one another. Whereas the 
various anaerobic phenotypes more often than not form 
phylogenetically deep, extensive groupings, such is not the 
case for their aerobic counterparts (56, 206). A good example 
is the comparison between the Clostridia (a remarkably deep 
anaerobic phylogenetic unit) and Bacillus (a phylogenetically 
much shallower collection of aerobes) (56). Thus, although 
aerobic phenotypes evolved a number of times, their lack of 
phylogenetic depth suggests that they are all relatively 
recent in origin. 

Photosynthesis. In slightly different versions, photosynthe- 
sis appears in at least half of the eubacterial phyla. The 
purple, blue-green, gram-positive (Heliobacterium), and
green sulfur lineages stem from the same general area on one 
of the two major branches of the eubacterial tree (Fig. 11). 
The remaining photosynthetic type, the green non-sulfur
bacteria, represents a significantly deeper branching in the 
tree (162). Within some phyla, e.g., the purple bacteria, the 
photosynthetic phenotype is intimately intermixed with 
nonphotosynthetic phenotypes (162, 267-269). Given the 
complexity of the photosynthetic apparatus, it seems un- 
likely that the process has evolved more than once within the 
eubacteria, and so its origin is deep in the eubacterial tree, 
possibly at the stage of the common eubacterial ancestor. 

Thermophilia. It can be even more compellingly argued 
that thermophilia is an ancestral eubacterial characteristic. 
All deeper branchings in the eubacterial tree involve either 
predominantly or exclusively thermophilic groups (1): the 
Thermotoga lineage forms one side of the deepest known 
branching in the eubacterial line. (The organism so far has 
only one known [unnamed] relative, strain H21 of K. O. 
Stetter, again a thermophile [Stetter and Woese, unpub- 
lished data].) As mentioned above, the unusual thermophile 
Thermodesulfobacterium commune (119) represents an-
other very deep branching. A third is formed by Thermo-
microbium and its relatives (the green non-sulfur bacteria),
again a predominantly thermophilic group (94, 162, 170). All 
of these thermophilic lineages are also relatively slowly 
evolving, especially Thermotoga, and should, by the tempo- 
mode criterion, be primitive in type. The only known 
mesophile to branch deeply in the eubacterial tree is 
Herpetosiphon (a relative of Chloroflexus and Thermo-
microbium). However, this organism is the product of a
rapid evolution (162) and so should not resemble the ances- 
tral phenotype to the extent that its thermophilic relatives 
do. Thermophiles are also found in the "mesophilic section"
of the eubacterial tree, e.g., Bacillus stearothermophilus and 
Thermus aquaticus. In almost all of these cases, too, the 
thermophilic lineages are at least as slowly evolving as their 
specific mesophilic relatives. The evidence clearly points to 
a thermophilic origin of the eubacteria. 

One could similarly, but less convincingly, argue an 
autotrophic ancestry for the eubacteria, for autotrophy is 
also widely distributed among eubacteria. 

Conclusion. The case for a photosynthetic, thermophilic, 
or autotrophic eubacterial ancestor can never be proven in
the strictest sense. But that is not what is important.
Because we can now approach microbial phylogeny experi-
mentally and can construct a microbial tree, we are in a
position to weed out former incorrect notions and develop 
new and more detailed ones more soundly based upon 
experimental evidence. We are on the threshold of greatly 
expanding our understanding of bacteria and their relation- 
ships to one another — and proceeding from there to a 
reconstruction of the history of life on this planet. 

Differences between Archaebacteria and Eubacteria 

The persistent influence of the procaryote-eucaryote 
dogma prevents many biologists from appreciating the im- 
portance of treating archaebacteria as distinct from 
eubacteria: "They may be very different, but they are still 
procaryotes" is an attitude one still hears repeatedly. Any 
such feeling inhibits understanding of the eubacterial- 
archaebacterial relationship. Even the fact that the names of 
the two groups bear the common suffix "-bacteria" acts to 
do so. Therefore, it is important at this juncture to stress the 
differences between the two procaryotic types. 

Cytological and physiological differences. Eubacteria and
archaebacteria are perceived as cytologically similar mainly 
because both are simpler than and do not resemble the more 
complexly structured eucaryotic cell. Within their "similar- 
ity," however, lie significant differences in cell architecture 
and metabolic patterns and remarkable differences in evolu- 
tionary behavior. Cells of many extreme halophiles are very 
thin, flat, and straight sided, with precisely square corners 
(216, 230); Fig. 16 shows a remarkable example. Flat pseu- 
do-geometric shapes have also been noted occasionally 
among the methanogens (242). Since flat angular geometries 
are foreign to eubacteria, basic differences between 
archaebacteria and eubacteria in cell architecture are im- 
plied. 

Methanogenesis involves a variety of coenzymes unique 
to that process, and at least one unusual coenzyme is 
somehow associated with sulfur metabolism in the extreme 
thermophiles (Fig. 12). It is hard to believe that such 
uniqueness is entirely confined to specialized biochemistries 
such as methanogenesis; some of it must carry over into the
cell's general metabolism. In other words, the unique coen- 
zymes imply significant differences between archaebacterial 
and eubacterial intermediary metabolisms. The area of 
archaebacterial intermediary metabolism is ripe for further 
study (57). 

As mentioned above, archaebacterial membranes may be 
somewhat different structurally from their eubacterial coun- 
terparts, for most of them contain a significant level of 
diglycerol tetraethers, something unprecedented among 
eubacteria and eucaryotes (117, 118). 

Genome organization and control are significantly dif- 
ferent in the two classes of procaryotes; one has the feeling 
that, if more were known, their difference would seem even 
more profound. Some archaebacterial genomes contain fam- 
ilies of repeat sequence DNA, not a eubacterial characteris- 
tic (39, 182). Although the sequences of many eubacterial 
genes are known, introns have not been reported. Yet among 
the few sequenced archaebacterial genes, a number contain 
introns: the majority of sequenced tRNA genes from 
Sulfolobus solfataricus contain introns (100; B. P. Kaine,
manuscript in preparation), as does one tRNA gene from an 
extreme halophile (30) and an archaebacterial 23S rRNA 
gene (107). Nothing so far indicates that archaebacteria use 
the well-known eubacterial types of gene regulation mecha- 
nisms. One would think that, if they did, some evidence for 
their doing so would by now exist. 

Niches. The two types of procaryotes tend to inhabit 
different types of environments. Archaebacteria prefer high- 
temperature niches. One side of the archaebacterial tree 
appears to consist only of thermophilic species. Methano- 
gens that grow at high temperatures are also fairly common 
(214, 278). No major well-characterized eubacterial group is 
known to be exclusively thermophilic, although the Thermo- 
toga side of the eubacterial tree, represented so far by only 
two species, remains potentially so (1, 89). Most eubacterial 
groups are not even predominantly thermophilic. 
Eubacteria, on the other hand, adapt more readily than 
archaebacteria to the myriad low-temperature niches, 
wherein they predominate. 

While eubacteria readily adapt to aerobic conditions, 
archaebacteria seem to have difficulty in doing so. Among 
archaebacteria, even facultative aerobes are relatively un- 
common; there are no obligate aerobes (79). The evolution of 
oxygen utilization in archaebacteria appears to be associated 
with episodes of rapid evolution (see above), which is not the 
case with (at least some) eubacteria; Bacillus, for example, 
although aerobic, represents one of the most slowly evolving 
of eubacterial lines by sequence distance measures (56, 163, 
265). 

Metabolic versatility. The evolutionary differences be- 
tween the two classes of procaryotes manifest themselves in 
several ways. One is an ill-defined evolutionary quality that, 
for want of a better term, will be called "metabolic versatil-
ity." Eubacteria are remarkable for the variety of their
metabolisms, both the overall variety and the extent to 
which variation occurs within some of the phyla and subdi- 
visions. By comparison, archaebacteria are metabolically 
monotonous. Little variation exists within the three basic 
archaebacterial phenotypes; the utilization of acetate and 
methyl amines by one group of methanogens (discussed 
above) is an example of the sort that does occur. While 
photosynthesis is a prevalent, perhaps ancestral, theme 
among eubacteria, no archaebacteria are completely photo- 
synthetic; the extreme halophiles, however, have acquired 
the capacity to use light to run various molecular pumps
(122, 217). It is remarkable how little metabolic convergence
the two procaryotic groups have shown over their several 
billion years of coexistence. 

Molecular plasticity. Paradoxically, the archaebacteria 
show more variety than eubacteria do in another evolution- 
ary parameter, a quality we will call "molecular plasticity," 
i.e., the variations in molecular design of a given function 
within a group. That the archaebacteria are exceptional in 
this respect has been apparent from the outset (250). Exam- 
ples of the group's molecular plasticity (mentioned above) 
are seen in 5S rRNA secondary structure (53, 133, 208), 
DNA-dependent RNA polymerase subunit patterns (187, 
282), ribosome protein contents (21), the extent of post- 
transcription modification of bases in rRNAs and tRNAs (70, 
71, 73, 259), cell wall structure (101, 109, 111), types of 
coenzymes (238), and antibiotic sensitivity (22). In all of 
these cases archaebacteria present at least two distinctly 
different types, more often a spectrum of types, while 
eubacteria present a relatively monotonous picture of uni- 
formity (250). 

Rates of evolution. The two classes of procaryotes seem to 
be evolving at generally different rates. Evidence suggesting 
this is seen in the phylogenetic breadth of the two groups, 
i.e., the sequence distance separating the most slowly evolv- 
ing representatives of the deepest branches in each tree. The 
phylogenetic breadth of the archaebacteria (i.e., the distance 
between the extremely thermophilic representatives on its 
two main branches) is under 20%; see Table 17. For 
eubacteria the comparable distance (between Thermotoga 
and Bacillus [1]) is slightly under 30%. Since the root of the 
universal tree is not yet known, it is possible that all this 
means is that the archaebacteria are not as old a group as are 
the eubacteria (attributing the smaller sequence distances in 
this case to shorter time of evolution, rather than a slower 
rate). Such an assumption would require the root of the 
universal tree to be placed relatively high on the eubacterial
branch, which then makes the archaebacteria and 
eucaryotes quite specific relatives of one another and the 
eucaryotes a very rapidly evolving line of descent, assump- 
tions I find intuitively unappealing. 

Summary. This discussion of eubacterial-archaebacterial 
differences amounts briefly to this: although ostensibly sim- 
ilar cytologically, the two groups of procaryotes are signifi- 
cantly different in cellular makeup and in their modes of 
evolution. They tend to inhabit basically different niches. 
They appear to evolve at different rates. They show differ- 
ences in two evolutionary parameters, metabolic versatility 
and molecular plasticity. And, over several billion years of 
coexistence they have shown little or no tendency to con- 
verge in phenotype. It is as though the two have evolved in 
different worlds. 

Why Are Archaebacteria and Eubacteria So Different? 

How are we to understand these substantial differences 
between the two types of procaryotes? Are they a matter of 
different environments during the early stages of evolution in 
the two lines? Are basic organizational differences in the two 
types of cells being reflected in different evolutionary pro- 
clivities? We are in no position to answer such questions 
now. It is not simply that facts are lacking; key concepts 
seem to be missing too. The best we can do now is to ask 
questions that we hope will lead us to experiments that 
provide the required insights. 

One key question seems to be whether archaebacteria are 
significantly older than eubacteria (or vice versa). If 
archaebacteria already existed before the (most recent)
common ancestor of all eubacteria arose, then the basic 
archaebacterial phenotype would probably have evolved to 
suit a global environment very different from that in which 
eubacteria later arose, making the two cell types basically 
and perhaps irrevocably dissimilar. 

A second key question is whether the archaebacterial 
ancestor was more primitive than its eubacterial counter-
part, and if so, in what ways. A more rudimentary, less
highly integrated ancestor would conceivably have a broader 
spectrum of potential phenotypes into which it might evolve 
than would a more highly integrated, more constrained and 
"advanced" ancestor. An archaebacterial ancestor of this 
type would explain the greater molecular plasticity of the 
group. 

Finally, one wants to know the relationship between the 
common ancestors of both groups (and their relationship to 
the eucaryotic ancestor). What sort of cell, entity, or system 
gave rise to the three urkingdoms? It is unlikely that we will 
ever know the full answer here; however, it is something 
about which we will soon be able to infer a great deal more 
than we can now. 

Primitiveness of Archaebacteria

In the past, discussion of what was and was not a primitive 
characteristic was more or less fruitless. (What appeared 
primitive to some often turned out merely to be degenerate,
e.g., the mycoplasmas.) This is no longer the case, for 
molecular chronometers provide a nearly certain definition 
of "primitive." In a group of homologous molecular func- 
tions the one whose sequence is closest to that of the 
common ancestral version is necessarily the most primitive. 
Provided the position of its root can be fixed, a sequence tree 
then decides the issue. 

Current evidence tentatively suggests that archaebacteria 
are probably the more primitive of the two procaryotes, in 
three senses of the word: (i) the group as a whole is older 
than the eubacteria; (ii) their common ancestor was a more 
primitive type of entity than the eubacterial common ances- 
tor; and (iii) since they evolve at a slower rate than do the 
eubacteria, archaebacteria today remain more primitive 
(more ancestral in type) than eubacteria. 

The most convincing evidence comes, of course, from the 
rRNA chronometer. Since the archaebacterial 16S rRNA is 
closer in sequence to both its eubacterial and eucaryotic 
counterparts than these two are to one another, the 
archaebacterial version of the molecule must be closer to the 
common ancestral version than is one or both of the other 
versions (72). Placing a root on the (unrooted) universal tree 
anywhere within a zone that includes the archaebacterial 
branch and a fair segment on both of the other main branches 
(which can be visualized on Fig. 4) would make the 
archaebacteria more primitive than both of the others. 
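
The reasoning in this paragraph can be made concrete with the standard three-point calculation for an unrooted tree of three taxa. The Python sketch below is illustrative only: it assumes the pairwise distances are additive along the tree, and the three distances used are hypothetical round numbers, not measurements from this review.

```python
def three_taxon_branches(d_ab, d_ac, d_bc):
    """Branch lengths from the central node of an unrooted 3-taxon tree.

    Assumes additive distances: d_ab = a + b, d_ac = a + c, d_bc = b + c.
    """
    a = (d_ab + d_ac - d_bc) / 2.0
    b = (d_ab + d_bc - d_ac) / 2.0
    c = (d_ac + d_bc - d_ab) / 2.0
    return a, b, c


# Hypothetical distances (percent): A = archaebacteria, B = eubacteria,
# C = eucaryotes. A is closer to B and to C than B and C are to each other.
arch, eub, euc = three_taxon_branches(d_ab=40.0, d_ac=45.0, d_bc=55.0)
print(arch, eub, euc)  # 15.0 25.0 30.0
```

When one group is closer to both of the others than they are to each other, its branch to the central node comes out shortest. This by itself does not fix the root; it only shows why a root placed anywhere in a zone around that short branch leaves the archaebacterial sequence nearest the ancestral version.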

A similar, but weaker, conclusion can be drawn from the 
relationship among the DNA-dependent RNA polymerases
in the three urkingdoms. By serological cross-reactivity, the 
archaebacterial polymerases appear closer to their eubacte- 
rial and eucaryotic counterparts than these two types are to 
one another (90, 187). (Sequencing studies now in progress 
should soon permit more definite conclusions.) This ten- 
dency for archaebacteria to be closer to both of the other 
groups than they are to one another also carries over (in a 
qualitative sense) to the general phenotype (250). In a few 
years, when sequence evidence is available for a significant 
number of different molecular functions, it may be possible
to say in a quantitative way that the general archaebacterial
phenotype is more primitive than at least one of the others. 
If so, the question of archaebacterial primitiveness is half 
answered. 

Knowing the root of the universal tree (the ancestral point) 
would automatically determine which of the three pheno- 
types is the most primitive. Conventional wisdom holds that 
the root of the universal tree cannot be determined, because 
no outgroup exists by which to position it. However, the 
root of the universal tree can be determined, in principle if 
not in practice. What is required is a gene that has duplicated 
in the common ancestor state (as pointed out by M. Dayhoff 
long ago). If two (functionally distinct) versions of such a 
gene fulfilled certain technical requirements, they could then 
be used in effect to determine relative rates of evolution 
within each lineage, thereby fixing the tree's root. A practi- 
cal system for doing this does not yet exist. 

When Did Procaryotes Evolve? 

The general nature of the archaebacterial phenotype 
strongly implies the nature of the environment in which it 
arose. Given the widespread distribution of thermophilic 
species (and their universal occurrence on the extreme 
thermophile branch), it seems impossible that these orga-
nisms arose from a mesophilic ancestor. The ancestral
archaebacterium was a thermophile, probably growing at 
temperatures near the present boiling point of water (see 
above discussion). This makes it likely that the archae- 
bacteria arose when the ambient temperature of the planet 
was high, i.e., within the first billion years or so of earth 
history (43, 271). The ancestral archaebacterial environment 
seems also to have been highly reducing, for most 
archaebacteria today are fastidious anaerobes, again impli- 
cating rather early stages in earth history, when both 
hydrosphere and atmosphere would have been reducing 
(228). 

Since stromatolites existed at least 3.5 billion years ago 
(231), eubacteria such as Chloroflexus almost certainly ex- 
isted at that time (162). Given the distribution of thermo- 
philia among eubacteria (see above), these organisms too 
appear to be of ancient and thermophilic origin. However, in 
eubacteria thermophilia is not as extreme as in archae- 
bacteria; finding archaebacteria that grow at the boiling point 
of water is now common, but thermophilic eubacteria grow- 
ing above 90°C are rare. The ancestral eubacterium might 
then have arisen later than the archaebacteria, when the 
planet was somewhat cooler (43, 271). 

The indications are there: archaebacteria seem an ancient 
and primitive phenotype, more so than the eubacteria. The
experiments that would convincingly establish the point are 
apparent and some are now being done. Eucaryotes remain 
the puzzle. Were they too of thermophilic origin? How do 
they fit into this scenario developed for the bacteria? 

THE UNIVERSAL ANCESTOR 

All questions concerning relationships among the three 
urkingdoms and the general course of evolution in each 
ultimately turn upon the nature of their common ancestor. 
The nature of this universal ancestor is, in my opinion, 
probably the most important, and definitely the least recog- 
nized, major question in biology today. As we shall see, the 
universal ancestor may have been a kind of entity outside of 
our direct experience. Even to begin considering it, we have 
to question concepts generally taken for granted. 
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Consider how information is organized in various living 
systems. In procaryotes the bulk of the information in the 
cell occurs in very long, contiguous strings of genes; 
procaryotes could then be called "genomic** organisms. 
Lower eucaryotes are also genomic, but the higher ones, 
metazoa with complex development systems, higher ner- 
vous function, and social structures, should probably be 
designated "supi^enomic** entities, for much of the infor- 
mation they contain lies in structures above the level of the 
genome. Entities simpler and more primitive than genomic 
ones must also have existed. An organism of this type could 
have had a genotype and phenotype (i.e. , information stored 
in a quiescent [replicative] form in one class of molecules 
that was also manifested in an active [functional] form in 
another), but its genes would for the most part have been 
physically separate units; they would not be organized into 
large contiguous linear arrays. These less organized systems 
would be called "genetic" but not genomic entities. (The
reason for distinguishing genetic from genomic entities will 
become apparent as we proceed.) At a still more primitive 
level, entities can be imagined in which the genotype and 
phenotype do not exist in the sense we know them. Instead 
the storage/replicative and the active/functional forms of 
information both reside in the same class of molecule, 
possibly in different configurations of a given molecule. This 
would be the stage of "nucleic acid life" (245-247), where
translation as we know it has not yet evolved and nucleic 
acids have both genetic and enzymatic functions. The idea 
that there could have been such an early stage has recently 
become quite popular with the experimental demonstration 
that RNAs can have catalytic properties (26, 232). Still more 
primitive stages can be imagined in which the bulk of the 
information in the system is not in macromolecular primary 
structures, but is contained in autocatalytic/metabolic
networks.

Progenotes 

The progenote is a theoretical construct, an entity that, by 
definition, has a rudimentary, imprecise linkage between its 
genotype and phenotype (251, 256). (Extant organisms, 
which have precise, accurate links between genotype and 
phenotype, are then genotes.) The certainty that progenotes 
existed at some early stage in evolution follows from the 
nature of the translation apparatus. Translation is accom- 
plished by a multicomponent (multigene) mechanism that 
includes on the order of 100 different macromolecular spe- 
cies, far too complex a system to have arisen initially fully 
formed. Like the radio, the automobile, and similar devices, 
translation had to evolve through stages, from a much more 
rudimentary mechanism to the present precisely functioning 
one (244-246, 248, 256). Its aboriginal forms had fewer 
components and, consequently, must have functioned less 
accurately than their modern counterparts. There seems no 
alternative to the conclusion that the progenote existed at 
some stage early in evolution. 

Characteristics of the progenote. The limitations of its 
rudimentary translation mechanism ensure that the proge- 
note was a highly unique entity, unlike any life found today. 
Without today's level of accuracy in translation, proteins of 
normal size could not have been synthesized without intro- 
ducing (many) errors. This means the progenote could 
neither have had nor have evolved "modern" proteins (244).
Its proteins would have been small or of nonunique sequence 
or both. (A collection of polypeptides all different from one 
another, but each an approximate translation of the same
genetic sequence, is known as a "statistical protein" [244].)
As a consequence the progenote's enzymes would not be as
accurate and specific as their modern counterparts. This in 
turn would delimit the kinds of control mechanisms, the 
definition and number of states the system possessed, and so 
on. Biological specificity at the progenote stage had to have 
been generally lower than now exists (251). 

Replicating a genome places a tremendous burden of 
accuracy on a cell. To reproduce a string of nucleic acid 
thousands of genes long without introducing significant error 
requires an extremely precise mechanism, which today involves
a number of separate activities (proof-reading functions,
error correcting systems, and so on), most of which
utilize large proteins. Such an extensive, complex, and 
precise system would not be found in a progenote, which 
means that the progenote could not have carried (could not 
have accurately replicated) the number of genes found in 
modern cells (251). A factor of 10 drop in the accuracy 
means a proportionate reduction in the length of the genome. 
The progenote reasonably had error rates two, or even three, 
orders of magnitude greater than found in cells today. (A 
factor of 100 drop in accuracy would leave the mistake rate 
in the range of 1 part per million [monomer units intro- 
duced], which is still impressively accurate.) One therefore 
wonders how progenotes could have carried a sufficient 
number of (different) genes to make them even minimally 
functional cells. 
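
The proportionality argument above can be sketched numerically. The following is a minimal illustration, not part of the original argument. It assumes a modern error rate of 1 in 10^8 monomer units (the value implied by the statement that a 100-fold drop leaves about 1 part per million) and a hypothetical modern genome of 4,000,000 nucleotides. The names MODERN_ERROR_RATE, MODERN_GENOME_LENGTH, and both functions are illustrative only.

```python
# Illustrative only: how a drop in replication accuracy shrinks the genome
# length that can be copied with the same expected number of errors.

MODERN_ERROR_RATE = 1e-8          # assumed errors per monomer in modern cells
MODERN_GENOME_LENGTH = 4_000_000  # hypothetical modern genome, in nucleotides


def error_rate_after_drop(fold_drop: float) -> float:
    """Error rate per monomer after accuracy falls by the given factor."""
    return MODERN_ERROR_RATE * fold_drop


def max_genome_length(fold_drop: float) -> float:
    """Genome length that keeps expected errors per copy unchanged.

    Uses the simple inverse proportionality stated in the text.
    """
    return MODERN_GENOME_LENGTH / fold_drop


if __name__ == "__main__":
    for fold in (10, 100, 1000):
        rate = error_rate_after_drop(fold)
        length = max_genome_length(fold)
        print(f"{fold:>5}-fold drop: error rate {rate:.0e}, "
              f"genome length about {length:,.0f} nucleotides")
```

Under these assumptions a 100-fold or 1,000-fold drop leaves room for only a few tens of thousands of nucleotides or fewer, which is the difficulty the paragraph raises.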

This apparent paradox can be resolved by making the 
progenote a genetic, not a genomic, entity. Genes would 
then be disjoint, and they could have existed in high copy 
numbers, in which case an appropriately simple mechanism 
can be imagined that would detect errors in individual genes 
and selectively eliminate (not correct) the flawed ones (251). 
As a genetic entity, the progenote could, in spite of a 
relatively very error-prone gene replication process, carry a 
reasonable number of genes. Given disjoint genes that might 
assume functional configurations (251), it is likely that the 
informational macromolecule at this stage was RNA, the 
functional form of nucleic acid today, not DNA (251). 



Was the Universal Ancestor a Progenote? 

In principle the universal ancestor could have resembled 
any one of the three major types of extant organisms. It also 
could have in essence been a collage of all three, or have 
been very unlike any of them. I will argue that the last 
alternative is the correct one and that the universal ancestor 
was a progenote. 

The evolution that transformed the universal ancestor into 
the individual ancestors of each of the three primary king- 
doms was of a unique quality. Sequence distances between 
kingdoms (Fig. 4) seem large compared to the distances 
within kingdoms, this despite the fact that the bulk of 
evolutionary time has involved evolution within the king- 
doms. (The existence of 3.5 billion-year-old stromatolites 
[231] implies the existence of photosynthetic bacteria at that 
time, and so the existence of the common eubacterial 
ancestor even earlier.) Therefore, the long sequence dis- 
tances do not correspond to long times. The transition from 
the universal ancestor to the ancestors of each of the primary 
kingdoms had to have taken less than 1 billion years, perhaps 
far less if an appreciable fraction of earth's first billion years 
involved evolutionary stages that preceded that of the uni- 
versal ancestor. It would seem that the tempo of evolution at 
the time of the universal ancestor was very high. 

The types of phenotypic changes that accompanied the
formation of the three primary kingdoms are of a special 
nature. General differences in cell architecture among the 
three groups are remarkable, as are their differences in 
intermediary metabolism, and each kingdom seems to have 
its own unique version of every fundamental cellular func- 
tion: translation, transcription, genome replication and con- 
trol, and so on. The kind of variation that subsequently 
occurred within each of the kingdoms is minor by compari- 
son. Thus the mode of evolution accompanying the transi- 
tion from the universal ancestor is unusual; far more novelty 
arose during formation of the primary kingdoms than during 
the subsequent evolutionary course in any one of them. 

It is hard to avoid concluding that the universal ancestor 
was a very different entity than its descendants. If it were a 
more rudimentary sort of organism, then the tempo of its 
evolution would have been high and the mode of its evolu- 
tion highly varied, greatly expanded. 

Were the actual root of the universal tree (Fig. 4) located 
in the vicinity of the deepest branchings in any one of the 
three primary kingdoms, the above argument concerning 
sequence distances would not apply to that kingdom, which 
makes it conceivable that the universal ancestor had the 
basic phenotype of that group. (This argument is particularly 
attractive as regards the archaebacteria, for the group sits 
relatively close to the intersection of the three primary 
lineages; see Fig. 4.) However, this would still leave the 
problem of deriving the other two phenotypes from a third 
comparably complex one, which entails drastic changes at 
the molecular level in most functions in the cell. In my 
opinion the changes in overall cell structure, organization, 
etc., required to change one of the three phenotypes into 
either of the others are too drastic and disruptive to have 
actually occurred. 

Accepting all this, the only solution to the problem is for 
the universal ancestor to have been a progenote. Since the 
progenote is far simpler and more rudimentary than extant 
organisms, the significant differences in basic molecular 
structures and processes that distinguish the three major 
types of organisms would be attributes that the universal 
ancestor never possessed. In other words, the more rudi- 
mentary versions of a function present in the progenote 
would become refined and augmented independently, and so 
uniquely, in each of its progeny lineages. This independent 
refinement (and augmentation) of a more rudimentary func- 
tion, not the replacement of one complex function by a 
different complex version thereof (the beginning stages of 
which would be strongly selected against), is why remark- 
able differences in detail have evolved for the basic functions 
in each of the urkingdoms. Biological specificity does not 
arise full-blown in cells, and in the transition from the 
universal ancestor to its descendants we are witnessing the 
evolution of biological specificity itself. 

If the universal ancestor were a progenote, a particular 
pattern (spectrum) of relationships would exist among the 
various functions in the three primary kingdoms that would 
be hard to explain otherwise. The progenote lacked most of 
the functions characteristic of cells today, and those it did 
possess existed in a primitive, imprecise form. Therefore, 
functions that were central to the progenote and its descen- 
dants would have undergone the least evolutionary change 
and so would be the most similar in organisms today. The 
translation apparatus is a case in point. Without it no 
genotype-phenotype relationship exists, enzymes as we 
know them cannot evolve, accurate replication of nucleic 
acids is impossible, etc. (251). Translation had therefore to
be one of the earliest cellular functions to arise. As then
expected, it is one of the most structurally conserved func- 
tions. Only the fine-tuning aspects of the process appear to 
differ from one kingdom to the next; the replacement of the 
thymidine residue found in eubacterial and eucaryotic 
tRNAs by 1-methyl-pseudouridine in archaebacterial tRNAs 
(73, 165) exemplifies the type of differences encountered. 

Structure developed only in crude, primitive ways in the 
progenote would undergo significant refinement and aug- 
mentation in the descendant lineages. While such functions 
would be homologous in all kingdoms, they would be noticeably
idiosyncratic and characteristic in each as well. RNA
polymerase could be an example here (281, 282). Functions 
not present in the progenote would then be totally idiosyn- 
cratic or be analogous, not homologous, in kingdom com- 
parisons. Aspects of genome organization may turn out to be 
examples here, for the progenote did not face the problem of 
organizing thousands of genes, i.e., of developing an ordered 
genome structure. It is also possible that some primitive 
function in the progenote becomes reworked into another 
function(s) in one (or more) of the primary lineages. In this 
case, we might expect to find examples of structural homol- 
ogy without functional homology between kingdoms. Some 
control mechanisms may fall into this class (H. Hartman, 
personal communication). 

The hierarchy of diversification suggested by the proge- 
note, from highly homologous structures, to slightly homol- 
ogous ones, to analogous ones, to idiosyncratic structures, 
should define the order in which the various processes arose 
(or became functionally readapted) during cellular evolution. 
On the consistency of this picture will turn the validity and 
utility of the progenote concept. 

The progenote is today the end of an evolutionary trail that 
starts with fact, progresses through inference, and fades into 
fancy. However, in science endings tend to be beginnings. 
Within a decade we will have before us at least an order of 
magnitude more evolutionary information than we now 
possess and will be able to infer a great deal more with a 
great deal more assurance than we now can. The root of the
universal tree will probably have been determined, many 
gene families will have been defined, the evolution of 
genomic organization and of control mechanisms will have 
become serious problems, the enzymatic capacities of RNA 
will be more thoroughly elucidated, and the relationship 
between the evolution of the planet and the life thereon will 
be much better understood. The concepts of the progenote 
and of nucleic acid life will have come into their own. 

Proposal for a New Hierarchic Classification System,
Actinobacteria classis nov.

ERKO STACKEBRANDT,* FRED A. RAINEY, and NAOMI L. WARD-RAINEY

A new hierarchic classification structure for the taxa between the taxonomic levels of genus and class is
proposed for the actinomycete line of descent as defined by analysis of small subunit (16S) rRNA and genes 
coding for this molecule (rDNA). While the traditional circumscription of a genus of the actinomycete 
subphylum is by and large in accord with the 16S rRNA/rDNA-based phylogenetic clustering of these organ- 
isms, most of the higher taxa proposed in the past do not take into account the phylogenetic clustering of 
genera. The rich chemical, morphological and physiological diversity of phylogenetically closely related genera 
makes the description of families and higher taxa so broad that they become meaningless for the description 
of the enclosed taxa. Here we present a classification system in which phylogenetically neighboring taxa at the 
genus level are clustered into families, suborders, orders, subclasses, and a class irrespective of those pheno- 
typic characteristics on which the delineation of taxa has been based in the past. Rather than being based on 
a listing of a wide array of chemotaxonomic, morphological, and physiological properties, the delineation is 
based solely on 16S rDNA/rRNA sequence-based phylogenetic clustering and the presence of taxon-specific 16S 
rDNA/RNA signature nucleotides. 



In their publication "On the nature of global classification," 
Wheelis et al. (177) based the definition of higher taxa on the 
molecular level of universally homologous functions. This 
statement is derived from the high correlation of genealogical 
trees inferred from several such molecules, e.g., genes coding 
for 16S rRNA (16S rDNA) (179), 23S rDNA (96), elongation 
factors involved in the translation process, and the β-subunit of
ATPase (97). The authors (177) stress that a basic requirement 
of a global classification is uniformity in methods and charac- 
teristics used in defining and ranking taxa. Nonhomologous 
characteristics, on the other hand, may be useful in confirming 
the molecular groupings. Application of this classification 
strategy led to the description of domains for the three highest 
taxa recognized today, the Archaea, Bacteria, and Eucarya 
(180). As a consequence of the description of kingdoms for the 
major lineages within the domain Eucarya (plants, animals, 
fungi, and protozoa), Woese et al. (180) described the two 
main lineages within the domain Archaea as the kingdom Cre- 
narchaeota and the kingdom Euryarchaeota. 

Within the domain Bacteria, more than 15 lineages, which in 
phylogenetic uniqueness and ancestry are comparable to the 
archaeal kingdoms, have been identified. The taxonomic rank 
of kingdom has not yet been proposed for any of these lin- 
eages. The taxon class Proteobacteria has been proposed for a 
phylogenetically broad cluster of gram-negative genera, and 
several orders have been described for some of the phylogenetic
lineages that emerged from the comparison of evolutionarily
conserved macromolecules, e.g., Aquificales (15), Thermotogales
(67), Verrucomicrobiales (173), and Planctomycetales
(138). These phylogenetically coherent taxa are now used side 
by side with higher taxa that were described at the beginning of 
the pre-molecular era, i.e., before or around 1984. While the 
phylogenetic coherence of the division Firmacutes (53), the 
class Mollicutes, and the orders Chlamydiales, Spirochaetales, 
and Myxobacterales were by and large confirmed following 16S 
rDNA analyses of their members, the majority of higher taxa
represent a collection of phylogenetically diverse families and 
genera. Examples are the classes Actinomycetes (81) and Pho- 
tobacteria (53) and the orders Clostridiales and Bacillales (119), 
which need to be redefined in order to make classification 
consistent with phylogenetic structure. 

One of the main lines of descent within the domain Bacteria 
includes a wide range of morphologically diverse organisms, 
most of which, on the basis of a gram-positive staining reaction, 
can be considered members of the division Firmacutes (53). 
This lineage comprises organisms with a DNA base composi- 
tion which generally is above 50 mol% G+C (with a few 
exceptions) and includes representatives of the class Actinomy- 
cetes (81), the orders Actinomycetales (13) and Micrococcales 
(118), the tribes Brevibactereae and Micrococceae (120), and 
several families of the order Actinomycetales as well as addi- 
tional organisms which were identified as members of this 
lineage by phylogenetic analyses. This lineage encompasses a 
wide range of bacteria that irrespective of Gram stain reaction, 
base composition of DNA, morphology, chemotaxonomic 
properties, and other characteristics used to delineate bacterial 
taxa in the past, have a common ancestry (Fig. 1). 

The modern era in the classification of organisms that are 
proposed as members of the class Actinobacteria has its origin 
in three sources: firstly, the establishment of chemotaxonomy 
that detects differences in the chemical composition of cell 
constituents such as peptidoglycan, polar lipids and fatty acids, 
isoprenoid quinones, cytochromes, and the base composition 
of DNA; secondly, the introduction of DNA-DNA reassocia- 
tion experiments that measure the gross similarities between 
single-stranded DNA of strains of closely related species (144); 
and thirdly, the determination of 16S rRNA and rDNA se- 
quence similarities, which reveals the extent of sequence vari- 
ation among strains at all levels of relatedness (148). Each of 
these approaches has contributed to the success of a classifi- 
cation strategy which has been termed polyphasic by Colwell 
(34). 

Although the appropriate methods have been available for 
decades, it took about 30 years to achieve a comprehensive 
overview of the relatedness among actinomycete bacteria that 
would allow a proposal for a unified classification system. The 

FIG. 1. Dendrogram of 16S rDNA/rRNA relationships showing the phylogenetic position of the proposed class Actinobacteria within the domain Bacteria.
Sequences of Archaea were used to root the dendrogram. The recovery of the main lines of descent is as published by Woese (179). The scale bar represents 10
nucleotide substitutions per 100 nucleotides.



reasons were manifold. The most obvious one is the late in- 
troduction of 16S rRNA/DNA sequence analysis that, better 
than any other taxonomic method, places an organism in the 
framework of phylogenetic relationships. The restricted infor- 
mation on the degree of relatedness obtained by DNA-DNA 
reassociation studies (143) only allowed analysis of highly re- 
lated species, while elucidation of more remote relationships 
had to wait for analysis of conservative genes. Another reason 
was the tremendous amount of work needed to analyze the 
phylogenetic relatedness of the type strains of several hundred 
species which in the end led to the recognition of those prop- 
erties at the epigenetic level that could be used to unambigu- 
ously circumscribe species at the genus level. 

MATERIALS AND METHODS 

Phylogenetic analyses. As this publication reviews published data, no emphasis 
is placed on methods. Rather we refer to relevant publications, most of which 
contain original or cited data on phylogeny, chemotaxonomy, and other phenotypic
data used in the classification of actinomycetes above the species level.

An alignment of all 16S rRNA/DNA sequences currently available for mem- 
bers of the actinomycete lineage was created by using the ae2 editor (99). The 
sequences included in this alignment were obtained from the Ribosomal Data- 
base Project (99) and the ARB package (a software environment for sequence 
data) (156) as well as our own entries. The 16S rRNA/DNA sequences were
manually aligned to provide a secondary structure-based optimal alignment. The
best sequences available were chosen in all cases, but due to the fact that many
of the actinomycete reference sequences obtained from the databases are partial
sequences consisting of fewer than 1,300 nucleotides, an ideal data set could not
be constructed.

The complete data set used for the analyses described in this study contained 
information on more than 1,000 unambiguous nucleotide positions present in all 
sequences between positions 150 and 1400 (Escherichia coli sequence [11]).
Individual data sets, which in many cases comprised more than 1,300 nucleotides, 
were used for the determination of the relationships within the groups of the 
proposed taxonomic structure of the actinomycetes. The data set consisting of 
the type strains of the type species of the genera included in this study can be 
requested from the authors. The following genera have either not yet been 
investigated by 16S rDNA sequence analysis and their phylogenetic affiliation 
remains undetermined or their exact phylogenetic position is unclear: Actinocorallia
(68), Actinokineospora (64), Excellospora (1), Kineococcus (186), Kineosporia
(112), and Micropolyspora (93).

For the reconstruction of the phylogenetic dendrograms, evolutionary dis- 
tances were calculated by the method of Jukes and Cantor (70). Phylogenetic 
dendrograms were reconstructed by using treeing algorithms contained in the 
PHYLIP package (46) and the ARB package (156). The robustness of tree 
topologies was evaluated by bootstrap analyses (45) of the neighbor-joining (136) 
data by performing 1,000 resamplings. The phylogenetic dendrograms presented 
in many cases represent a general overview at the higher-taxon level and do not
show the individual branching order at the genus and species level. These lower-taxon
branching orders are susceptible to changes depending on the data set used
in the analysis, but the compositions of taxa above the genus level will not be 
affected. 
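
The distance correction named above (Jukes and Cantor) has a simple closed form, d = -(3/4) ln(1 - (4/3) p), where p is the observed fraction of differing aligned positions. The sketch below is a minimal Python illustration of that formula, not the PHYLIP or ARB implementation. The function names and the rule of skipping gapped or ambiguous positions are assumptions made for the example.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of differing sites among aligned, unambiguous positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differing = 0
    # Treat rRNA (U) and rDNA (T) sequences alike.
    norm_a = seq_a.upper().replace("U", "T")
    norm_b = seq_b.upper().replace("U", "T")
    for a, b in zip(norm_a, norm_b):
        if a not in valid or b not in valid:
            continue  # assumption: skip gaps and ambiguity codes
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


def jukes_cantor_distance(p: float) -> float:
    """Estimated substitutions per site under the Jukes-Cantor model."""
    if p >= 0.75:
        raise ValueError("distance undefined: sequences are saturated")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


if __name__ == "__main__":
    p = p_distance("ACGUACGUAC", "ACGUACGUAA")  # toy sequences
    print(f"p = {p:.3f}, JC distance = {jukes_cantor_distance(p):.3f}")
```

A corrected distance of 0.10 corresponds to 10 substitutions per 100 nucleotides, the unit used for the scale bars of the dendrograms.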

FIG. 2. Proposed hierarchic classification system of the class Actinobacteria based on the phylogenetic analyses of the 16S rDNA/rRNA sequence data.



RESULTS AND DISCUSSION 

The basis for the proposal of a novel hierarchic structure 
(Fig. 2) for the phylogenetically coherent group of actinomy- 
cete bacteria and relatives is membership of the same phylo- 
genetic group which was formerly described as a 16S rRNA 
subdivision or subphylum of the gram-positive bacteria. The 
common ancestry of the actinomycetes proper and the second 
subdivision of gram-positive bacteria, defined by Clostridia, ba- 
cilli, and their relatives, has not yet been convincingly demon- 
strated by 16S rDNA analyses. However, sequence analyses of 
glutamine synthetase, glutamate dehydrogenase, and the heat 
shock protein HSP70 reveal a common ancestry of the gram-
positive bacteria (59, 60). The phylogenetic coherence of these
organisms supports the description of a kingdom for the Fir- 
macutes which would contain two or more classes, one of which 
embraces the actinomycetes proper. If the results of future 
studies were to unambiguously demonstrate the common an- 
cestry of members of the two subphyla they could be united 
under the umbrella of a common higher taxon, the kingdom. 

Membership of a new strain to the class Actinobacteria is 
indicated by 16S rDNA sequence similarity values above 80% 
as determined by comparison of almost-complete 16S rDNA 
sequences of the new strain and the most deeply branching 
members of the class, such as Rubrobacter radiotolerans, Aci- 
dimicrobium ferrooxidans, or Coriobacterium glomerans, and 
the presence of signature nucleotides. We are aware that the 
phylogenetic tree, upon which the conclusions outlined below
are based, is a mathematical model of how bacterial evolution
occurs. Signature nucleotides are derivatives of the classifica- 
tion process; i.e., signatures are determined for those organ- 
isms that are contained within a particular data set. It is also 
known that a significant increase in species numbers in any of 
the phylogenetic lineages may lead to a decrease in the number 
of signatures as the 16S rDNA of more slowly or more rapidly 
evolving strains may not contain the signature. It is hoped that 
this proposal will stimulate analyses of other conservative 
genes from organisms that cluster together by 16S rDNA anal- 
yses, so that taxonomic information is provided from addi- 
tional, independently selected genes and properties and the 
hierarchic structure proposed here can be tested. 
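
The similarity criterion for class membership can be sketched as follows. This is a minimal illustration that assumes the sequences are already aligned to equal length. The toy sequences, the function names, and the choice to require every supplied reference to exceed the threshold are assumptions for the example; the text gives only the 80% value and names the reference organisms.

```python
CLASS_SIMILARITY_THRESHOLD = 80.0  # percent, as stated in the text


def percent_similarity(seq_a: str, seq_b: str) -> float:
    """Percent identical bases over aligned, unambiguous positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    bases = set("ACGU")
    compared = 0
    identical = 0
    norm_a = seq_a.upper().replace("T", "U")
    norm_b = seq_b.upper().replace("T", "U")
    for a, b in zip(norm_a, norm_b):
        if a in bases and b in bases:
            compared += 1
            if a == b:
                identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


def similarity_to_references(new_seq: str, references: dict) -> dict:
    """Similarity of a new strain to each named reference sequence."""
    return {name: percent_similarity(new_seq, seq)
            for name, seq in references.items()}


def meets_similarity_criterion(new_seq: str, references: dict,
                               threshold: float = CLASS_SIMILARITY_THRESHOLD) -> bool:
    """True if similarity to every reference is above the threshold."""
    values = similarity_to_references(new_seq, references).values()
    return all(value > threshold for value in values)


if __name__ == "__main__":
    # Toy stand-ins for almost-complete 16S rDNA sequences (hypothetical).
    references = {"reference_1": "ACGTACGTAA", "reference_2": "ACGTACGTAC"}
    print(similarity_to_references("ACGTACGTAC", references))
    print(meets_similarity_criterion("ACGTACGTAC", references))
```

The similarity value alone is not sufficient under the proposal; the presence of the signature nucleotides must be checked as well.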

The signatures given below for the higher taxa were chosen 
for their presence in more than 95% of the members of the 
respective taxon. The signature pattern for monospecific fam- 
ilies must be considered tentative. It should be mentioned that 
the pattern of signature nucleotides, but not necessarily each 
individual nucleotide, is indicative of the membership of a 
taxon to a higher taxon. Bootstrap values (not shown) were 
determined for the branching points shown in Fig. 3 and were 
higher than 90% in only a few cases, indicating a lack of 
statistical significance of the respective branching points. De- 
spite the finding that the majority of branching points are not 
supported by high bootstrap values, the orders, suborders, and 
families described previously and below are consistently recov- 
erable from phylogenetic analyses. The lack of high bootstrap 

FIG. 3. Intraclass relatedness of Actinobacteria showing the presence of six orders as well as the 10 suborders of the order Actinomycetales based upon 16S
rDNA/rRNA sequence comparison. The phylogenetic relatedness of the families of the class Actinobacteria is outlined. The scale bar represents 5 nucleotide
substitutions per 100 nucleotides.



values is probably due to the small differences in the primary 
structure of 16S rDNA of members of most of the neighboring 
families. Many of the signature nucleotides and idiosyncrasies 
are located in highly variable and hypervariable regions of the 
16S rDNA molecule, which means that dendrograms of relat- 
edness are mostly derived from nucleotide positions that are 
subject to changes. Also, low bootstrap values are found when 
the radiation of lineages occurs within a short evolutionary 
time span, like those found for the radiation of most suborders 
of the order Actinomycetales. Nevertheless, the separate 
branching of these suborders is found in any of the trees 
generated, irrespective of the algorithm used. This means that 
even if the branching points of taxa above the genus level 
change within the narrow margin at which the higher taxa 
separate from each other, their phylogenetic uniqueness re- 
mains unchanged and their affiliation to the next higher taxon 
will not be greatly affected. As indicated above, the patterns of 
signature nucleotides may need to be emended when analysis 
of novel members of the families indicates the necessity for 
doing so. In this respect, the taxa defined by molecular data 
reflect the most recent state-of-the-art insights into molecular 
systematics. A strategy that could be used in the future to 
decide whether or not a member of a novel genus could be 
considered a member of a known or a novel family and/or any
other of the defined higher ranks would include (i) generation
of a high-quality, complete or almost-complete 16S rDNA 
sequence, (ii) proper alignment of the sequence to reference 
sequences of the same quality, (iii) determination of the phy- 
logenetic position, and (iv) checking for the presence of signa- 
ture nucleotides as given below for members of the taxon with 
which the organism groups. If the vast majority of the signature 
nucleotides match those previously defined for the nearest 
neighbors, the organism can be considered a member of this 
taxon. If, however, several signature nucleotides are missing, 
this genus most likely represents a novel family for which a 
novel set of signature nucleotides must be defined. 
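
Step (iv) of the strategy above can be sketched as a simple tally. This is a minimal illustration under stated assumptions: the aligned sequence is represented as a mapping from E. coli position to base, the signature set shown in the usage example is hypothetical, and the 0.9 cutoff standing in for "the vast majority" is an assumption, since the text gives no numerical threshold.

```python
def signature_match_fraction(bases_by_position: dict, signatures: dict) -> float:
    """Fraction of signature positions whose base is one of the allowed bases.

    bases_by_position: {E. coli position: base} for the aligned new sequence.
    signatures: {E. coli position: set of allowed bases} for the taxon.
    """
    if not signatures:
        raise ValueError("no signature positions supplied")
    matches = sum(
        1 for position, allowed in signatures.items()
        if bases_by_position.get(position) in allowed
    )
    return matches / len(signatures)


def suggest_affiliation(bases_by_position: dict, signatures: dict,
                        min_fraction: float = 0.9) -> str:
    """Suggest an outcome; min_fraction is an assumed cutoff."""
    fraction = signature_match_fraction(bases_by_position, signatures)
    if fraction >= min_fraction:
        return "member of the taxon"
    return "possible novel family"


if __name__ == "__main__":
    # Hypothetical signature set and strain, for illustration only.
    signatures = {100: {"A"}, 200: {"G"}, 300: {"C", "U"}, 400: {"U"}}
    strain = {100: "A", 200: "G", 300: "U", 400: "C"}
    print(signature_match_fraction(strain, signatures))
    print(suggest_affiliation(strain, signatures))
```

The result is only a suggestion; steps (i) to (iii) and the judgment of the taxonomist remain part of the decision.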

The proposal presented here does not change the current 
descriptions of species and genera, which are in most cases 
based upon morphological, chemotaxonomic, and physiologi- 
cal characteristics. These taxa, which provide the working basis 
for taxonomists, have been revised in the past 20 years to 
constitute phylogenetically coherent taxonomic units. In con- 
trast, the descriptions of taxa above the genus level have a 
more broadly based description to incorporate the character- 
istics of the individual genera contained therein. The descrip- 
tion of the family Micromonosporaceae should serve as an 
example (78), and other examples include the families Propi- 
onibacteriaceae and Pseudonocardiaceae. 

Class Actinobacteria classis nov., Stackebrandt, Rainey, and
Ward-Rainey. Actinobacteria (Ac.ti.no.bac.te'ri.a. Gr. n. actis,
actinis, a ray, beam; Gr. dim. n. bakterion, a small rod; -ia,
proposed ending to denote class; Actinobacteria, actinomycete
group of bacteria of diverse morphological properties). The
class is definable in phylogenetic terms as derived from the
analysis of macromolecules of universally homologous functions.
Strains of the class Actinobacteria can consistently be
recovered as members of the same phylogenetic lineage, re- 
vealing >80% 16S rDNA/rRNA sequence similarity among 
each other (Fig. 1), and the presence of the following signature 
nucleotides in the 16S rDNA/rRNA: an A residue at position 
906 and either an A or a C residue at position 955 (except for 
members of the subclasses Rubrobacteridae and Sphaerobacte- 
ridae [which show U residues at these positions]). 

The intraclass relatedness reveals the presence of six phylo- 
genetically distinct lineages which are consistently recovered 
from phylogenetic analyses (42). These lineages are described 
as orders (Fig. 3). The 16S rDNA/rRNA signatures defining 
the higher taxa are based on the available 16S rDNA/rRNA 
sequences of the type strains of type species of the genera. As 
certain taxa contain only one or two species, the pattern of 
signature nucleotides may need to be modified as new species 
are added to the respective genera. 

Subclass Acidimicrobidae subclassis nov., Stackebrandt, 
Rainey, and Ward-Rainey. Acidimicrobidae (A.ci.di.mi.cro. 
bi'dae. M.L. n. Acidimicrobium, type genus of the subclass; 
-idae, ending to denote a subclass; M.L. fem. pl. n. Acidimicrobidae,
the Acidimicrobium subclass). The subclass contains the
type order Acidimicrobiales. The 16S rDNA/rRNA signature 
pattern is as that of the family Acidimicrobiaceae. 

Order Acidimicrobiales ordo. nov., Stackebrandt, Rainey, 
and Ward-Rainey. Acidimicrobiales (A.ci.di.mi.cro.bi.a'les.
M.L. n. Acidimicrobium, type genus of the order; -ales, ending
to denote an order; M.L. fem. pl. n. Acidimicrobiales, the
Acidimicrobium order). The order contains the type family
Acidimicrobiaceae. The 16S rDNA/rRNA signature pattern is as
that of the family Acidimicrobiaceae. 

Family Acidimicrobiaceae fam. nov., Stackebrandt, Rainey, 
and Ward-Rainey. Acidimicrobiaceae (A.ci.di.mi.cro.bi.a'ce.ae.
M.L. neut. n. Acidimicrobium, type genus of the family; -aceae,
ending to denote a family; M.L. fem. pl. n. Acidimicrobiaceae,
the Acidimicrobium family). The following pattern of 16S 
rDNA/rRNA signature nucleotides and nucleotide pairs de- 
fines the family Acidimicrobiaceae: 291-309 (U-A), 294-303 
(U-A), 408-434 (A-U), 670-736 (C-G), 722-733 (A-C), 955- 
1225 (C-G), 1118-1155 (C-G), 1311-1326 (A-U), and 1410- 
1490 (A-U). The family contains the type genus Acidimicro- 
bium (24). Phylogenetic analyses have been published 
previously (17, 23, 149). 
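
A signature pattern of this kind, written as single positions "N (X)" and nucleotide pairs "N-M (X-Y)", can be turned into a lookup table and compared with an observed sequence. The following Python sketch is illustrative only: the helper names are hypothetical, the observed sequence is assumed to be given as a position-to-base mapping in the same numbering as the pattern, and only the bases A, C, G, and U are handled (entries such as "Pur" or "Pyr" in other patterns would need extra handling).

```python
import re

# Pattern text copied from the description of the family Acidimicrobiaceae.
ACIDIMICROBIACEAE = (
    "291-309 (U-A), 294-303 (U-A), 408-434 (A-U), 670-736 (C-G), "
    "722-733 (A-C), 955-1225 (C-G), 1118-1155 (C-G), 1311-1326 (A-U), "
    "1410-1490 (A-U)"
)

# One item: a position or a position pair, then a base or a base pair.
ITEM = re.compile(r"(\d+)(?:-(\d+))?\s*\(([ACGU])(?:-([ACGU]))?\)")


def parse_signature(text):
    """Return a dict {position: expected base} from a pattern string."""
    expected = {}
    for pos1, pos2, base1, base2 in ITEM.findall(text):
        expected[int(pos1)] = base1
        if pos2:  # nucleotide pair: second position carries the second base
            expected[int(pos2)] = base2
    return expected


def mismatches(bases, expected):
    """List positions where the observed base differs from the signature."""
    return sorted(p for p, b in expected.items() if bases.get(p) != b)


signature = parse_signature(ACIDIMICROBIACEAE)
observed = dict(signature)
observed[722] = "G"  # introduce one deviation for demonstration
print(mismatches(observed, signature))
```

Returning the list of deviating positions, rather than a yes/no answer, reflects the remark above that signature patterns may need to be modified as new species are added.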

Subclass Rubrobacteridae subclassis nov., Rainey, Ward- 
Rainey, and Stackebrandt. Rubrobacteridae (Ru.bro.bac.te.ri' 
dae. M.L. masc. n. Rubrobacter, type genus of the subclass; 
-idae, ending to denote a subclass; M.L. fem. pl. n.
Rubrobacteridae, the Rubrobacter subclass). The subclass contains the
type order Rubrobacterales. The 16S rDNA/rRNA signature 
pattern is as that of the family Rubrobacteraceae. 

Order Rubrobacterales ordo. nov., Rainey, Ward-Rainey, and 
Stackebrandt. Rubrobacterales (Ru.bro.bac.te.ra'les. M.L. 
masc. n. Rubrobacter, type genus of the order; -ales, ending to 
denote an order; Rubrobacterales, the Rubrobacter order). The 
order contains the type family Rubrobacteraceae. The 16S 
rDNA/rRNA signature pattern is as that of the family Rubro- 
bacteraceae. 

Family Rubrobacteraceae fam. nov., Rainey, Ward-Rainey, 
and Stackebrandt. Rubrobacteraceae (Ru.bro.bac.te.ra'ce.ae.
M.L. masc. n. Rubrobacter, type genus of the family; -aceae,
ending to denote a family; M.L. fem. pl. n. Rubrobacteraceae,
the Rubrobacter family). The following pattern of 16S rDNA/ 
rRNA signature nucleotides and nucleotide pairs defines the 
family: 127-234 (G-C), 291-309 (U-A), 657-749 (G-C), 681-709 
(C-G), 941-1342 (A-U), 955-1225 (U-A), 1051-1207 (C-G), 
1115-1185 (C-G), 1311-1326 (A-U), and 1410-1490 (A-U). The 
family contains the genus Rubrobacter (157). A phylogenetic 
analysis has been published previously (17). 

Subclass Coriobacteridae subclassis nov., Stackebrandt, 
Rainey, and Ward-Rainey. Coriobacteridae (Co.ri.o.bac.te.ri' 
dae. M.L. neut. n. Coriobacterium, type genus of the subclass; 
-idae, ending to denote a subclass; M.L. fem. pl. n.
Coriobacteridae, the Coriobacterium subclass). The subclass contains the
type order Coriobacteriales. The 16S rDNA/rRNA signature 
pattern is as that of the family Coriobacteriaceae. 

Order Coriobacteriales ordo. nov., Stackebrandt, Rainey, and 
Ward-Rainey. Coriobacteriales (Co.ri.o.bac.te.ri.a'les. M.L. 
neut. n. Coriobacterium, type genus of the order; -ales, ending 
to denote an order; M.L. fem. pl. n. Coriobacteriales, the
Coriobacterium order). The order contains the type family
Coriobacteriaceae. The 16S rDNA/rRNA signature pattern is as
that of the family Coriobacteriaceae. 

Family Coriobacteriaceae fam. nov., Stackebrandt, Rainey, 
and Ward-Rainey. Coriobacteriaceae (Co.ri.o.bac.te.ri.a'ce.ae. 
M.L. neut. n. Coriobacterium, type genus of the family; -aceae, 
ending to denote a family; M.L. fem. pl. n. Coriobacteriaceae,
the Coriobacterium family). The pattern of 16S rDNA/rRNA 
signature nucleotides of members of the family consists of 
113-314 (C-G), 294-303 (G-C), 295-302 (U-A), 670-736 (G-C), 
771-808 (U-A), 772-807 (A-U), 823-877 (A-U), 941-1342 (A- 
U), 950-1231 (U-G), 1120-1153 (G-C), 1148 (C), 1165-1171 
(C-G), 1242-1295 (G-C), 1313-1324 (G-C), and 1410-1490 (A- 
U). The family contains the type genus Coriobacterium (61) as 
well as Atopobium (26). Phylogenetic analyses have been pub- 
lished previously (128, 146). 

Subclass Sphaerobacteridae subclassis nov., Stackebrandt, 
Rainey, and Ward-Rainey. Sphaerobacteridae (Sphae.ro.bac.te.ri'dae.
M.L. masc. n. Sphaerobacter, type genus of the subclass;
-idae, ending to denote a subclass; M.L. fem. pl. n.
Sphaerobacteridae, the Sphaerobacter subclass). The subclass contains the
type order Sphaerobacterales. The 16S rDNA/rRNA signature
pattern is as that of the family Sphaerobacteraceae. 

Order Sphaerobacterales ordo. nov., Stackebrandt, Rainey, 
and Ward-Rainey. Sphaerobacterales (Sphae.ro.bac.te.ra'les.
M.L. masc. n. Sphaerobacter, type genus of the order; -ales,
ending to denote an order; M.L. fem. pl. n. Sphaerobacterales,
the Sphaerobacter order). The order contains the type family 
Sphaerobacteraceae. The 16S rDNA/rRNA signature pattern is 
as that of the family Sphaerobacteraceae. 

Family Sphaerobacteraceae fam. nov., Stackebrandt, Rainey, 
and Ward-Rainey. Sphaerobacteraceae (Sphae.ro.bac.te.ra' 
ce.ae. M.L. masc. n. Sphaerobacter, type genus of the family; 
-aceae, ending to denote a family; M.L. fem. pl. n.
Sphaerobacteraceae, the Sphaerobacter family). The 16S rDNA/rRNA
signature pattern for the family consists of 291-309 (U-A), 
294-303 (U-A), 408-434 (A-U), 417-426 (C-G), 657-749 (G-C), 
670-736 (G-C), 681-709 (C-G), 941-1342 (A-U), 955-1225 (U- 
A), 1120-1153 (G-C), 1148 (C), and 1351-1371 (C-G). The 
family contains the type genus Sphaerobacter (41). A phyloge- 
netic analysis has been published previously (41). 

Subclass Actinobacteridae subclassis nov., Stackebrandt, 
Rainey, and Ward-Rainey. Actinobacteridae (Ac.ti.no.bac.te.ri' 
dae. M.L. masc. n. Actinomyces, type genus of the subclass; 
-idae, ending to denote a subclass; M.L. fem. pl. n.
Actinobacteridae, the Actinomyces subclass). Members of
Actinobacteridae can consistently be recovered as members of the same
phylogenetic lineage (42, 48, 133) (Fig. 3). The subclass contains
two orders, Actinomycetales and Bifidobacteriales. The
type order is Actinomycetales Buchanan 1917 (13) emend.
Members of the subclass contain an insertion of about 100 
bases between helices 54 and 55 within domain III of the 23S 
rDNA (132). This insertion has not been found in members of 
the subclass Coriobacteridae (cf. reference 42). The pattern of 
16S rDNA/rRNA signatures consists of nucleotides at posi- 
tions 291-309 (C-G), 294-303 (C-G), 408-434 (G-C), 670-736 
(A-U), 941-1342 (G-C), 1148 (U), and 1410-1490 (G-C). 

Order Actinomycetales Buchanan 1917 (13), emend. Stackebrandt,
Rainey, and Ward-Rainey. Actinomycetales (Ac.ti.no.my.ce.ta'les.
M.L. masc. n. Actinomyces, type genus of the
order; -ales, ending to denote an order; M.L. fem. pl. n.
Actinomycetales, the Actinomyces order). The 16S rDNA signature
pattern consists of nucleotides at positions 122-239 (A-G),
449 (A), 450-483 (G-C), 823-877 (G-C), and 1118-1155 (U-A).
The order contains the suborders Actinomycineae, Corynebacterineae,
Frankineae, Glycomycineae, Micrococcineae, Micromonosporineae,
Propionibacterineae, Pseudonocardineae, Streptomycineae,
and Streptosporangineae. The type suborder is
Actinomycineae. 

Suborder Actinomycineae subordo. nov., Stackebrandt, Rainey, 
and Ward-Rainey. Actinomycineae (Ac.ti.no.my.ci'ne.ae. M.L. 
masc. n. Actinomyces, type genus of the suborder; -ineae, ending
to denote a suborder; M.L. fem. pl. n. Actinomycineae, the
Actinomyces suborder). The 16S rDNA signature pattern is as 
that of the type family Actinomycetaceae. 

Family Actinomycetaceae Buchanan 1918 (14), emend. Stacke- 
brandt, Rainey, and Ward-Rainey. Actinomycetaceae (Ac.ti.no. 
my.ce.ta'ce.ae. M.L. masc. n. Actinomyces, type genus of the 
family; -aceae, ending to denote a family; M.L. fem. pl. n.
Actinomycetaceae, the Actinomyces family). The pattern of 16S 
rDNA signature nucleotides consists of positions 598-640 (U- 
G), 1059-1198 (U-A), and 1061-1195 (G-U). Genera belonging 
to the family are the type genus Actinomyces (63), Mobiluncus 
(141), and Arcanobacterium (32). Phylogenetic analyses have 
been published previously (51, 87, 99, 142). 

Suborder Propionibacterineae subordo. nov., Rainey, Ward- 
Rainey, and Stackebrandt. Propionibacterineae (Pro.pi.o.ni.bac.te.ri'ne.ae.
M.L. neut. n. Propionibacterium, type genus of
the suborder; -ineae, ending to denote a suborder; M.L. fem.
pl. n. Propionibacterineae, the Propionibacterium suborder).
The pattern of 16S rDNA signatures consists of nucleotides at 
positions 127-234 (A-U), 603-635 (A-U), 657-749 (G-C), 671- 
735 (A-U), 986-1219 (U-A), 987-1218 (G-C), 990-1215 (U-G), 
and 1059-1198 (C-G). The suborder contains the type family 
Propionibacteriaceae and the family Nocardioidaceae. 

Family Propionibacteriaceae Delwiche 1957 (40), emend. 
Rainey, Ward-Rainey, and Stackebrandt. Propionibacteriaceae 
(Pro.pi.o.ni.bac.te.ri.a'ce.ae. M.L. neut. n. Propionibacterium, 
type genus of the family; -aceae, ending to denote a family; 
M.L. fem. pl. n. Propionibacteriaceae, the Propionibacterium
family). The pattern of 16S rDNA signatures consists of nu- 
cleotides at positions 66-103 (A-U), 328 (U), 370-391 (C-G), 
407-435 (C-G), 602-636 (A-U), 658-748 (A-U), 686 (G), 780 
(A), 787 (C), 819 (G), 825-875 (A-U), and 1409-1491 (A-U). 
Genera included in the family are the type genus Propionibac- 
terium (108) as well as Luteococcus (163), Microlunatus (103), 
and Propioniferax (187). Phylogenetic analyses have been pub- 
lished previously (19, 20, 28, 103, 163, 187). 

Family Nocardioidaceae Nesterenko et al. 1985 (104), emend. 
Rainey, Ward-Rainey, and Stackebrandt. Nocardioidaceae 
(No.car.di.o.i.da'ce.ae. M.L. masc. n. Nocardioides, type genus
of the family; -aceae, ending to denote a family; M.L. fem. pl.
n. Nocardioidaceae, the Nocardioides family). The pattern of
16S rDNA signatures consists of nucleotides at positions 66- 
103 (G-C), 328 (C), 370-391 (G-C), 407-435 (A-U), 602-636 
(G-U), 658-748 (U-A), 686 (U), 780 (G), 787 (A), 819 (U), 
825-875 (G-C), and 1409-1491 (C-G). Genera included in the 
family are the type genus Nocardioides (116) and Aeromicro- 
bium (101). Phylogenetic analyses have been published previ- 
ously (101, 161). 
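
The two families of the suborder Propionibacterineae are defined over the same set of positions, so their patterns can be compared directly. The short Python sketch below tabulates the patterns given above for Propionibacteriaceae and Nocardioidaceae and lists the positions at which they differ; the dictionary layout and the function name are illustrative assumptions, not part of the original description.

```python
# Signature patterns as given in the two family descriptions above.
PROPIONIBACTERIACEAE = {
    "66-103": "A-U", "328": "U", "370-391": "C-G", "407-435": "C-G",
    "602-636": "A-U", "658-748": "A-U", "686": "G", "780": "A",
    "787": "C", "819": "G", "825-875": "A-U", "1409-1491": "A-U",
}
NOCARDIOIDACEAE = {
    "66-103": "G-C", "328": "C", "370-391": "G-C", "407-435": "A-U",
    "602-636": "G-U", "658-748": "U-A", "686": "U", "780": "G",
    "787": "A", "819": "U", "825-875": "G-C", "1409-1491": "C-G",
}


def discriminating_positions(sig_a, sig_b):
    """Return positions listed in both patterns whose bases differ."""
    shared = sig_a.keys() & sig_b.keys()
    return sorted(
        (p for p in shared if sig_a[p] != sig_b[p]),
        key=lambda p: int(p.split("-")[0]),  # order by first position
    )


diff = discriminating_positions(PROPIONIBACTERIACEAE, NOCARDIOIDACEAE)
print(len(diff), "of", len(PROPIONIBACTERIACEAE), "listed positions differ")
```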

Suborder Micrococcineae (Micrococceae Prevot 1961) (120),
emend. Stackebrandt, Rainey, and Ward-Rainey. Micrococcineae
(Mi.cro.coc.ci'ne.ae. M.L. masc. n. Micrococcus, type genus
of the suborder; -ineae, ending to denote a suborder; M.L.
fem. pl. n. Micrococcineae, the Micrococcus suborder). The
pattern of 16S rDNA signatures consists of nucleotides at 
positions 66-103 (A-U), 70-98 (U-A), 82-87 (G-C), 127-234 
(A-U), 449 (A), 598-640 (U-G), 600-638 (U-G), 722-733 (A- 
A), 952-1229 (C-G), 986-1219 (A-U), 987-1218 (A-U), and 
1059-1198 (U-A). The type family is Micrococcaceae. Other 
families in the suborder include Cellulomonadaceae, Promi- 
cromonosporaceae, Dermatophilaceae, Brevibacteriaceae, Derm- 
abacteraceae, Intrasporangiaceae, Jonesiaceae, and Microbacte- 
riaceae. 

Family Micrococcaceae Pribram 1929 (121), emend. Stackebrandt,
Rainey, and Ward-Rainey. Micrococcaceae (Mi.cro.coc.ca'ce.ae.
M.L. masc. n. Micrococcus, type genus of the
family; -aceae, ending to denote a family; M.L. fem. pl. n.
Micrococcaceae, the Micrococcus family). The pattern of 16S 
rDNA signatures consists of nucleotides at positions 293-304 
(G-U), 610 (G), 598-640 (U-U), 615-625 (G-C), 839-847 (A- 
U), 859 (U), 1025-1036 (C-G), 1026-1035 (C-G), 1265-1270 
(U-G), and 1278 (U). The family contains the type genus 
Micrococcus (25) as well as the genera Arthrobacter (35; 
emended in reference 77), Kocuria (150), Nesterenkonia (150), 
Renibacterium (137), Rothia (52), and Stomatococcus (7). Phy- 
logenetic analyses have been published previously (76, 150). 

Family Cellulomonadaceae Stackebrandt and Prauser 1991 
(147), emend. Stackebrandt, Rainey, and Ward-Rainey.
Cellulomonadaceae (Cel.lu.lo.mo.na.da'ce.ae. M.L. fem. n.
Cellulomonas, type genus of the family; -aceae, ending to denote a
family; M.L. fem. pl. n. Cellulomonadaceae, the Cellulomonas
family). The pattern of 16S rDNA signatures consists of nu- 
cleotides at positions 30-553 (C-G), 100 (C), 183-194 (A-U), 
258-268 (G-C), 610 (A), 615-625 (A-U), 630 (C), 658-748 (G- 
A), 659-746 (C-G), 694 (G), 747 (C), 832-854 (G-U), 859 (C), 
1002-1038 (G-C), 1003-1037 (G-U), 1006-1023 (A-C), and 
1256 (C). The family contains the type genus Cellulomonas (8; 
emended in references 24 and 153) as well as the genera 
Oerskovia (117; emended in reference 90) and Rarobacter 
(182). Phylogenetic analyses have been published previously 
(49, 129, 153). 

Family Promicromonosporaceae fam. nov., Rainey, Ward-Rainey,
and Stackebrandt. Promicromonosporaceae
(Pro.mi.cro.mo.no.spo.ra'ce.ae. M.L. fem. n. Promicromonospora, the
type genus of the family; -aceae, ending to denote a family;
M.L. fem. pl. n. Promicromonosporaceae, the Promicromonospora
family). The pattern of 16S rDNA signatures consists of
nucleotides at positions 77-92 (G-U), 144-178 (U-G), 183-194 
(C-G), 199-218 (U-U), 381 (A), 602-636 (G-U), 630 (C), 694 
(G), 1002-1038 (Purine-U), 1003-1037 (G-C), 1025-1036 (U- 
A), and 1267 (U). The family contains the type genus Promi- 
cromonospora (82). Phylogenetic analyses have been published 
previously (49, 129). 

Family Dermatophilaceae Austwick 1958 (4), emend. Stackebrandt,
Rainey, and Ward-Rainey. Dermatophilaceae (Der.ma.to.phi.la'ce.ae.
M.L. masc. n. Dermatophilus, type genus of
the family; -aceae, ending to denote a family; M.L. fem. pl. n.
Dermatophilaceae, the Dermatophilus family). The pattern of
16S rDNA signatures consists of nucleotides at positions 146- 
176 (G-U), 153-168 (G-U), 502-543 (G-C), 546 (G), 580-761 
(U-A), 602-636 (C-G), 615-625 (G-C), 659-746 (U-A), 825-875 
(G-C), 838-848 (Pyr-Pur), and 1251 (G). The family contains 
the type genus Dermatophilus (170) as well as the genera Ky- 
tococcus (150) and Dermacoccus (10). Phylogenetic analyses 
have been published previously (150, 151). 

Family Brevibacteriaceae Breed 1953 (10), emend. Stacke- 
brandt, Rainey, and Ward-Rainey. Brevibacteriaceae (Brev.i. 
bac.te.ri.a'ce.ae. M.L. neut. n. Brevibacterium, type genus of 
the family; -aceae, ending to denote a family; M.L. fem. pl. n.
Brevibacteriaceae, the Brevibacterium family). The pattern of 
16S rDNA signatures consists of nucleotides at positions 41- 
401 (U-A), 69-99 (C-U), 142-221 (U-A), 144-178 (U-G), 407- 
435 (C-G), 586-755 (U-A), 591-648 (G-U), 612-628 (G-C), 
616-624 (C-G), 631 (G), 660-745 (A-U), 670-736 (U-A), 896- 
903 (U-G), 1011-1018 (U-A), 1012-1017 (G-C), 1244-1293 (U- 
A), 1254-1283 (A-C), 1256 (A), 1257 (G), 1262 (A), 1263-1272 
(C-G), 1310-1327 (U-A), and 1442-1460 (U-G). The family 
contains the type genus Brevibacterium (10; emended in refer- 
ence 31). Phylogenetic analyses have been published previ- 
ously (16, 127). 

Family Dermabacteraceae fam. nov., Stackebrandt, Rainey, 
and Ward-Rainey. Dermabacteraceae (Der.ma.bac.te.ra'ce.ae. 
M.L. masc. n. Dermabacter, type genus of the family; -aceae, 
ending to denote a family; M.L. fem. pl. n. Dermabacteraceae,
the Dermabacter family). The pattern of 16S rDNA signatures 
consists of nucleotides at positions 153-168 (C-G), 248-276 
(U-G), 258-268 (A-U), 280 (U), 407-435 (G-U), 580-761 (U- 
A), 586-755 (U-A), 589-650 (C-G), 602-636 (C-G), 615-625 
(A-U), 838-848 (Pur-Pyr), and 1189 (C). The family contains 
the type genus Dermabacter (69) as well as the genus Brachy- 
bacterium (27). Phylogenetic analyses have been published pre- 
viously (16, 139). 

Family Intrasporangiaceae fam. nov., Rainey, Ward-Rainey,
and Stackebrandt. Intrasporangiaceae (In.tra.spo.ran.gi.a'ce.ae.
M.L. neut. n. Intrasporangium, type genus of the family;
-aceae, ending to denote a family; M.L. fem. pl. n.
Intrasporangiaceae, the Intrasporangium family). The pattern of 16S
rDNA signatures consists of nucleotides at positions 30-553 
(C-G), 69-99 (G-U), 140-223 (G-C), 157-164 (G-C), 258-268 
(A-U), 630 (C), 658-748 (G-U), 659-746 (U-A), 660-745 (G- 
C), 694 (G), 838-848 (C-G), 839-847 (U-A), 859 (C), 1003-1037 
(G-C), 1007-1022 (C-G), 1133-1141 (A-U), and 1134-1140 (C- 
G). The family contains the type genus Intrasporangium (71) as 
well as the genera Sanguibacter (47) and Terrabacter (29). Phy- 
logenetic analyses have been published previously (29, 47). 

Family Jonesiaceae fam. nov., Stackebrandt, Rainey, and 
Ward-Rainey. Jonesiaceae (Jone.si.a'ce.ae. M.L. fem. n. Jonesia,
type genus of the family; -aceae, ending to denote a family;
M.L. fem. pl. n. Jonesiaceae, the Jonesia family). The pattern of
16S rDNA signatures consists of nucleotides at positions 153- 
168 (C-G), 280 (U), 379-384 (G-C), 407-435 (G-U), 445-489 
(A-U), 589-650 (U-G), 602-636 (U-G), 615-625 (A-U), 668- 
738 (U-A), 838-848 (Pur-Pyr), and 1189 (C). The family con- 
tains the type genus Jonesia (130). Phylogenetic analyses have 
been published previously (49, 129). 

Family Microbacteriaceae Park et al. 1993 (113), emend. 
Rainey, Ward-Rainey, and Stackebrandt. Microbacteriaceae 
(Mi.cro.bac.te.ri.a'ce.ae. M.L. masc. n. Microbacterium, type 
genus of the family; -aceae, ending to denote a family; M.L. 
fem. pl. n. Microbacteriaceae, the Microbacterium family). The
pattern of 16S rDNA signatures consists of nucleotides at
positions 45-396 (U-A), 144-178 (C-G), 258-268 (A-U), 497
(A), 615-625 (A-U), 694 (G), 771-808 (G-C), 839-847 (G-U),
1256 (C), 1310-1327 (A-U), and 1414-1486 (U-A). The family
contains the type genus Microbacterium (109) as well as the 
genera Agrococcus (58), Agromyces (54), Aureobacterium (30), 
Clavibacter (39), Curtobacterium (181), and Rathayibacter 
(189). Phylogenetic analyses have been published previously 
(58, 122, 158, 159). 

Suborder Corynebacterineae subordo. nov., Stackebrandt, 
Rainey, and Ward-Rainey. Corynebacterineae (Co.ry.ne.bac.te. 
ri'ne.ae. M.L. masc. n. Corynebacterium, type genus of the 
suborder; -ineae, ending to denote a suborder; M.L. fem. pl. n.
Corynebacterineae, the Corynebacterium suborder). The pattern 
of 16S rDNA signatures consists of nucleotides at positions 
127-234 (G-C), 131-231 (U-Pur), 502-543 (A-U), 658-748 (A- 
A), 564 (C), 600-638 (G-C), 601-637 (U-G), 660-745 (U-A), 
671-735 (C-G), 819 (G), 952-1229 (U-A), 986-1219 (U-A), 
1116-1184 (U-G), and 1414-1486 (U-G). The suborder con- 
tains the type family Corynebacteriaceae as well as the families 
Dietziaceae, Gordoniaceae, Mycobacteriaceae, Nocardiaceae, 
and Tsukamurellaceae. 

Family Corynebacteriaceae Lehmann and Neumann 1907 
(95), emend. Stackebrandt, Rainey, and Ward-Rainey.
Corynebacteriaceae (Co.ry.ne.bac.te.ri.a'ce.ae. M.L. neut. n.
Corynebacterium, type genus of the family; -aceae, ending to denote a
family; M.L. fem. pl. n. Corynebacteriaceae, the
Corynebacterium family). The pattern of 16S rDNA signatures consists of
nucleotides at positions 293-304 (G-U), 307 (A), 316-337 (U- 
G), 468 (U), 508 (U), 586-755 (U-G), 631 (G), 661-744 (G-C), 
662-743 (U-G), 771-808 (A-U), 824-876 (C-G), 825-875 (G-C), 
837-849 (G-U), 843 (C), and 1059-1198 (U-A). The family 
contains the type genus Corynebacterium (94) as well as the 
genus Turicella (50). Phylogenetic analyses have been pub- 
lished previously (50, 114, 135). 

Family Mycobacteriaceae Chester 1897 (21), emend. Stacke- 
brandt, Rainey, and Ward-Rainey. Mycobacteriaceae (My.co. 
bac.te.ri.a'ce.ae. M.L. neut. n. Mycobacterium, type genus of 
the family; -aceae, ending to denote a family; M.L. fem. pl. n.
Mycobacteriaceae, the Mycobacterium family). The pattern of 
16S rDNA signatures consists of nucleotides at positions 70-98 
(A-U), 293-304 (G-U), 307 (C), 328 (U), 614-626 (A-U), 631 
(G), 661-744 (G-C), 824-876 (U-A), 825-875 (A-U), 843 (C), 
and 1122-1151 (A-U). The family contains the type genus My- 
cobacterium (94). Phylogenetic analyses have been published 
previously (115, 131). 

Family Nocardiaceae Castellani and Chalmers 1919 (18), 
emend. Rainey, Ward-Rainey, and Stackebrandt.
Nocardiaceae (No.car.di.a'ce.ae. M.L. fem. n. Nocardia, type genus of
the family; -aceae, ending to denote a family; M.L. fem. pl. n.
Nocardiaceae, the Nocardia family). The pattern of 16S rDNA 
signatures consists of nucleotides at positions 70-98 (U-A), 
139-224 (G-C), 843 (C), 1008-1021 (C-G), 1189 (C), 1244-1293 
(C-G), and 1308-1329 (C-G). The family contains the type 
genus Nocardia (169) as well as the genus Rhodococcus (190). 
Phylogenetic analyses have been published previously (22, 124, 
134). 

Family Gordoniaceae fam. nov., Rainey, Ward-Rainey, and 
Stackebrandt. Gordoniaceae (Gor.do.ni.a'ce.ae. M.L. fem. n.
Gordonia, type genus of the family; -aceae, ending to denote a
family; M.L. fem. pl. n. Gordoniaceae, the Gordonia family).
The pattern of 16S rDNA signatures consists of nucleotides at 
positions 70-98 (A-U), 293-304 (A-U), 307 (U), 661-744 (A- 
U), 824-876 (U-A), 825-875 (A-U), 843 (U), 1002-1038 (A-U), 
1007-1022 (C-G), 1122-1151 (G-C), and 1124-1149 (A-U). The 
family contains the type genus Gordonia (154). Corrigendum: 
The name Gordonia, not Gordona, is proposed as the correct 
etymology. Phylogenetic analyses have been published previously
(6, 124, 134).

Family Tsukamurellaceae fam. nov., Rainey, Ward-Rainey,
and Stackebrandt. Tsukamurellaceae (Tsu.ka.mu.rel.la'ce.ae.
M.L. fem. n. Tsukamurella, type genus of the family; -aceae,
ending to denote a family; M.L. fem. pl. n. Tsukamurellaceae,
the Tsukamurella family). The pattern of 16S rDNA signatures 
consists of nucleotides at positions 70-98 (U-A), 293-304 (G- 
U), 307 (C), 631 (C), 661-744 (G-C), 824-876 (U-A), 825-875 
(A-U), 843 (C), 1007-1022 (G-U), and 1122-1151 (A-U). The 
family contains the type genus Tsukamurella (33). Phylogenetic 
analyses have been published previously (33, 124, 183). 

Family Dietziaceae fam. nov., Rainey, Ward-Rainey, and 
Stackebrandt. Dietziaceae (Diet.zi.a'ce.ae. M.L. fem. n. Dietzia,
type genus of the family; -aceae, ending to denote a family;
M.L. fem. pl. n. Dietziaceae, the Dietzia family). The pattern of
16S rDNA signatures consists of nucleotides at positions 70-98 
(U-A), 293-304 (G-U), 307 (U), 418-425 (U-A), 508 (U), 614- 
626 (U-G), 631 (G), 661-744 (A-U), 771-808 (A-U), 824-876 
(C-G), 825-875 (G-C), 843 (C), 1049-1198 (U-A), and 1122- 
1151 (A-U). The family contains the type genus Dietzia (125). 
A phylogenetic analysis has been published previously (124). 

Suborder Pseudonocardineae subordo. nov., Stackebrandt, 
Rainey, and Ward-Rainey. Pseudonocardineae (Pseu.do.no.car'di.ne.ae.
M.L. fem. n. Pseudonocardia, the type genus of
the suborder; -ineae, ending to denote a suborder; M.L. fem.
pl. n. Pseudonocardineae, the Pseudonocardia suborder). The
pattern of 16S rDNA signatures is as that for the family. The 
type family is Pseudonocardiaceae. 

Family Pseudonocardiaceae Warwick et al. 1994 (175), 
emend. Stackebrandt, Rainey, and Ward-Rainey.
Pseudonocardiaceae (Pseu.do.no.car.di.a'ce.ae. M.L. fem. n.
Pseudonocardia, type genus of the family; -aceae, ending to denote a
family; M.L. fem. pl. n. Pseudonocardiaceae, the
Pseudonocardia family). The pattern of 16S rDNA signatures consists of
nucleotides at positions 127-234 (G-C), 183-194 (G-U), 502- 
543 (A-U), 603-635 (C-G), 610 (A), 747 (A), 952-1229 (U-A), 
986-1219 (U-A), 987-1218 (G-C), 1001-1039 (Pyr-G), and 
1308-1329 (C-G). Comment: Although in phylogenetic terms 
this family is rather broad, it is currently not possible to sub- 
divide the family due to the lack of an unambiguous pattern of 
signature nucleotides. The family contains the type genus 
Pseudonocardia (66) as well as the genera Actinopolyspora (55), 
Actinosynnema (65), Amycolatopsis (92), Kibdelosporangium 
(140), Kutzneria (152), Lentzea (184), Saccharomonospora 
(106), Saccharopolyspora (86), Saccharothrix (84),
Streptoalloteichus (168), and Thermocrispum (79). Phylogenetic
analyses have been published previously (9, 43, 74, 79, 175, 
184). 

Suborder Streptomycineae subordo. nov., Rainey, Ward-Rainey, 
and Stackebrandt. Streptomycineae (Strep.to.my.ci'ne.ae. M.L. 
masc. n. Streptomyces, type genus of the suborder; -ineae, ending
to denote a suborder; M.L. fem. pl. n. Streptomycineae, the
Streptomyces suborder). The pattern of signature nucleotides 
of 16S rDNA is as that of the type family Streptomycetaceae. 

Family Streptomycetaceae Waksman and Henrici 1943 (171), 
emend. Rainey, Ward-Rainey, and Stackebrandt. Streptomyce- 
taceae (Strep.to.my.ce.ta'ce.ae. M.L. masc. n. Streptomyces, 
type genus of the family; -aceae, ending to denote a family; 
M.L. fem. pl. n. Streptomycetaceae, the Streptomyces family).
The family is emended to exclude the genus Sporichthya. The 
pattern of 16S rDNA signature nucleotides consists of 71 (G), 
80-89 (G-C), 81-88 (C-G), 82-87 (U-G), 127-234 (G-C), 209 
(C), 210 (C), 211 (G), 610 (G), 671-735 (U-A), 819 (G), 837- 
849 (C-G), 950-1231 (U-G), 955-1225 (C-G), 965 (C), 1254- 
1283 (A-U), and 1409-1491 (C-G). The family contains the 
type genus Streptomyces (171; emended in references 176 and
178). Phylogenetic analyses have been published previously
(73, 155, 160, 176, 178). 

Suborder Streptosporangineae subordo. nov., Ward-Rainey, 
Rainey, and Stackebrandt. Streptosporangineae (Strep.to.spo.ran.gi'ne.ae.
M.L. neut. n. Streptosporangium, type genus of the
suborder; -ineae, ending to denote a suborder; M.L. fem. pl. n.
Streptosporangineae, the Streptosporangium suborder). The pat- 
tern of 16S rDNA signatures consists of nucleotides at posi- 
tions 127-234 (A-U), 657-749 (G-Pyr), and 955-1225 (C-G). 
The type family is Streptosporangiaceae. 

Family Streptosporangiaceae Goodfellow et al. 1990 (56) (val- 
idation list no. 34), emend. Ward-Rainey, Rainey, and Stacke- 
brandt. Streptosporangiaceae (Strep.to.spo.ran.gi.a'ce.ae. M.L. 
neut. n. Streptosporangium, type genus of the family; -aceae, 
ending to denote a family; M.L. fem. pl. n.
Streptosporangiaceae, the Streptosporangium family). The pattern of 16S rDNA
signatures consists of nucleotides at positions 440-494 (C-G), 
445-489 (G-C), 501-544 (C-G), 502-543 (G-C), no extra base 
between positions 453 and 479, 586-755 (U-G), 613-627 (Pyr- 
Pur), 681-709 (U-A), 1116-1184 (U-G), 1137 (U), 1355-1367 
(A-U), 1436-1465 (G-C), and 1422-1478 (G-U). The family
contains the type genus Streptosporangium (37) as well as the 
genera Herbidospora (83), Microbispora (105), Microtetraspora 
(167), Planobispora (164), and Planomonospora (166). Phylo- 
genetic analyses have been published previously (172, 174). 

Family Nocardiopsaceae Rainey et al. 1996 (127), emend. 
Rainey, Ward-Rainey, and Stackebrandt. Nocardiopsaceae 
(No.car.di.op.sa'ce.ae. M.L. fem. n. Nocardiopsis, type genus of
the family; -aceae, ending to denote a family; M.L. fem. pl. n.
Nocardiopsaceae, the Nocardiopsis family). The pattern of 16S 
rDNA signatures consists of nucleotides at positions 440-494 
(U-G), 442-492 (C-G), 445-489 (C-G), four extra bases be- 
tween positions 453 and 479, 501-544 (G-C), 502-543 (A-U), 
586-755 (C-G), 603-635 (U-A), 613-627 (C-G), 658-748 (U-A), 
671-735 (C-G), 681-709 (U-A), 1003-1037 (U-G), 1116-1184 
(C-G), 1137 (A), 1355-1367 (G-C), 1422-1478 (G-U), and 
1435-1466 (A-U). The family contains the type genus Nocar- 
diopsis (100). A phylogenetic analysis has been published pre- 
viously (127). 

Family Thermomonosporaceae fam. nov., Rainey, Ward-Rainey, 
and Stackebrandt. Thermomonosporaceae (Ther.mo.mo.no. 
spo.ra'ce.ae. M.L. fem. n. Thermomonospora, type genus of the
family; -aceae, ending to denote a family; M.L. fem. pl. n.
Thermomonosporaceae, the Thermomonospora family). The 
pattern of 16S rDNA signatures consists of nucleotides at 
positions 440-494 (C-G), 442-492 (G-C), four to seven extra 
bases between positions 453 and 479, 501-544 (C-G), 502-543
(G-C), 586-755 (C-G), 613-627 (C-G), 658-748 (C-U), 681-709 
(C-G), 1003-1037 (A-G), 1116-1184 (C-G), 1355-1367 (A-U), 
1422-1478 (G-C), and 1435-1466 (G-C). The family contains 
the type genus Thermomonospora (66) as well as the genera 
Actinomadura (89) and Spirillospora (38). Phylogenetic analy- 
ses have been published previously (127, 172). 

Suborder Micromonosporineae subordo. nov., Stackebrandt, 
Rainey, and Ward-Rainey. Micromonosporineae (Mi.cro.mo. 
no.spo.ri'ne.ae. M.L. fem. n. Micromonospora, type genus of
the suborder; -ineae, ending to denote a suborder; M.L. fem.
pl. n. Micromonosporineae, the Micromonospora suborder).
The pattern of 16S rDNA signature nucleotides is as indicated 
for the family. The suborder contains the type family Mi- 
cromonosporaceae. 

Family Micromonosporaceae Krassilnikov 1938 (80), emend. 
Koch et al. 1996 (78), emend. Stackebrandt, Rainey, and 
Ward-Rainey. Micromonosporaceae (Mi.cro.mo.no.spo.ra'ce.ae. 
M.L. fem. n. Micromonospora, type genus of the family; -aceae,
ending to denote a family; M.L. fem. pl. n.
Micromonosporaceae, the Micromonospora family). The pattern of 16S rDNA
signatures consists of nucleotides at positions 66-103 (G-C), 
127-234 (A-U), 153-168 (C-G), 502-543 (G-C), 589-650 (C-G), 
747 (A), 811 (U), 840-846 (C-G), 952-1229 (C-G), 1116-1184 
(C-G), and 1133-1141 (G-C). The family contains the type 
genus Micromonospora (111) as well as the genera Actinoplanes 
(36; emended in reference 145), Catellatospora (3), Couchio- 
planes (162), Catenuloplanes (185), Dactylosporangium (165), 
and Pilimelia (72). Phylogenetic analyses have been published 
previously (75, 78). 

Suborder Frankineae subordo. nov., Stackebrandt, Rainey, 
and Ward-Rainey. Frankineae (Frank.i'ne.ae. M.L. fem. n.
Frankia, type genus of the suborder; -ineae, ending to denote a
suborder; M.L. fem. pl. n. Frankineae, the Frankia suborder).
The pattern of 16S rDNA signatures consists of nucleotides at 
positions 82-87 (C-G), 127-234 (G-C), 141-222 (G-C), 371-390 
(G-C), 502-543 (A-U), and 1003-1037 (G-G). The suborder 
contains the type family Frankiaceae as well as the families 
Acidothermaceae, Microsphaeraceae, Geodermatophilaceae, and 
Sporichthyaceae. Phylogenetic analyses have been published 
previously (44, 62, 107, 123, 126). 

Family Frankiaceae Becking 1970 (5), emend. Hahn et al. 
1989 (62), emend. Normand et al. 1996 (107), emend. Stacke- 
brandt, Rainey, and Ward-Rainey. Frankiaceae (Frank.i.a' 
ce.ae. M.L. fem. n. Frankia, type genus of the family; -aceae,
ending to denote a family; M.L. fem. pl. n. Frankiaceae, the
Frankia family). The 16S rDNA signature nucleotide pattern 
consists of 139-224 (G-C), 148-174 (A-G), 155-166 (U-G), 839- 
847 (A-G), 987-1218 (G-C), 1059-1198 (C-G), and 1308-1329 
(C-G). The family contains the type genus Frankia (12). Phy- 
logenetic analyses have been published previously (62, 107). 

Family Geodermatophilaceae Normand et al. 1996 (107), 
emend. Stackebrandt, Rainey, and Ward-Rainey. Geodermato- 
philaceae (Ge.o.der.ma.to.phi.la'ce.ae. M.L. masc. n. Geoder- 
matophilus, type genus of the family; -aceae, ending to denote 
a family; M.L. fem. pl. n. Geodermatophilaceae, the
Geodermatophilus family). The pattern of 16S rDNA signatures
consists of nucleotides at positions 139-224 (C-G), 157-164 (A-U),
158-163 (A-U), 186-191 (C-G), 263 (G), 293-304 (G-U), 986- 
1219 (U-A), 987-1218 (A-U), 1059-1198 (U-A), and 1308-1329 
(U-A). The family contains the type genus Geodermatophilus 
(98) as well as the genus Blastococcus (2). Phylogenetic anal- 
yses have been published previously (44, 107). 

Family Microsphaeraceae fam. nov., Rainey, Ward-Rainey, 
and Stackebrandt. Microsphaeraceae (Mi.cro.sphae.ra'ce.ae. 
M.L. fem. n. Microsphaera, type genus of the family; -aceae, 
ending to denote a family; M.L. fem. pl. n. Microsphaeraceae, 
the Microsphaera family). The pattern of 16S rDNA signatures 
consists of nucleotides at positions 139-224 (C-G), 157-164 
(G-C), 186-191 (C-G), 839-847 (U-A), 987-1218 (A-U), 1059- 
1198 (U-A), and 1308-1329 (U-A). The family contains the 
type genus Microsphaera (188). A phylogenetic analysis has 
been published previously (188). 

Family Sporichthyaceae fam. nov., Rainey, Ward-Rainey, and 
Stackebrandt. Sporichthyaceae (Spo.rich.thy.a'ce.ae. M.L. fem. 
n. Sporichthya, type genus of the family; -aceae, ending to 
denote a family; M.L. fem. pl. n. Sporichthyaceae, the Sporichthya 
family). The pattern of 16S rDNA signatures consists of 
nucleotides at positions 139-224 (U-A), 186-191 (G-C), 600- 
638 (C-G), 839-847 (U-A), 987-1218 (A-U), 1059-1198 (U-A), 
and 1308-1329 (U-A). The family contains the type genus 
Sporichthya (91). A phylogenetic analysis has been published 
previously (126). 

Family Acidothermaceae fam. nov., Rainey, Ward-Rainey, 
and Stackebrandt. Acidothermaceae (A.ci.do.ther.ma'ce.ae. 
M.L. masc. n. Acidothermus, type genus of the family; -aceae, 
ending to denote a family; M.L. fem. pl. n. Acidothermaceae, 
the Acidothermus family). The pattern of 16S rDNA signatures 
consists of nucleotides at positions 139-224 (C-G), 186-191 
(G-C), 839-847 (A-U), 987-1218 (G-C), 1059-1198 (C-G), and 
1308-1329 (C-G). The family contains the type genus Acido- 
thermus (102). A phylogenetic analysis has been published pre- 
viously (123). 

Suborder Glycomycineae subordo. nov., Rainey, Ward-Rainey, 
and Stackebrandt. Glycomycineae (Gly.co.my.ci.ne'ae. 
M.L. masc. n. Glycomyces, type genus of the suborder; -ineae, 
ending to denote a suborder; M.L. fem. pl. n. Glycomycineae, 
the Glycomyces suborder). The pattern of signature nucleo- 
tides of 16S rDNA is as that of the type family Glycomyceta- 
ceae. 

Family Glycomycetaceae fam. nov., Rainey, Ward-Rainey, 
and Stackebrandt. Glycomycetaceae (Gly.co.my.ce.ta'ce.ae. 
M.L. masc. n. Glycomyces, type genus of the family; -aceae, 
ending to denote a family; M.L. fem. pl. n. Glycomycetaceae, 
the Glycomyces family). The pattern of 16S rDNA 
signature nucleotides contains 70-98 (A-U), 127-234 (G-Pyr), 
140-223 (A-U), 229 (G), 366 (U), 415 (C), 449 (C), 534 (G), 
681-709 (A-U), 825-875 (G-C), 999-1041 (C-G), 1059-1198 
(C-G), 1064-1192 (G-G), 1117-1183 (A-U), and 1309-1328 (C- 
G). The family contains the type genus Glycomyces (85). The 
phylogenetic position is shown in Fig. 3. 

Order Bifidobacteriales ordo. nov., Stackebrandt, Rainey, 
and Ward-Rainey. Bifidobacteriales (Bi.fi.do.bac.te.ri.a'les. 
M.L. neut. n. Bifidobacterium, type genus of the order; -ales, 
ending to denote an order; M.L. fem. pl. n. Bifidobacteriales, 
the Bifidobacterium order). The type family of the order is 
Bifidobacteriaceae. The 16S rDNA nucleotide signature is as 
that of the family. 

Family Bifidobacteriaceae fam. nov., Stackebrandt, Rainey, 
and Ward-Rainey. Bifidobacteriaceae (Bi.fi.do.bac.te.ri.a'ce.ae. 
M.L. neut. n. Bifidobacterium, type genus of the family; -aceae, 
ending to denote a family; M.L. fem. pl. n. Bifidobacteriaceae, 
the Bifidobacterium family). The pattern of 16S rDNA signa- 
tures consists of nucleotides at positions 122-239 (G-U), 128- 
233 (C-G), 450-483 (C-G), 602-636 (G-C), 681-709 (C-G), 
688-699 (A-U), 823-877 (A-U), 1118-1155 (C-G), and 1311-1326 
(A-U). The family contains the type genus Bifidobacterium 
(110) as well as Gardnerella (57). A phylogenetic structure of 
the family has been published previously (88, 99). 
column 2, paragraph 1 

BEERS M., BERKOW R.: "The Merck Manual of Diagnosis and Therapy, seventeenth 
edition" 1999, MERCK RESEARCH LABORATORIES, WHITEHOUSE STATION N.J., 
XP002318189 
Re Item III. 

Claims 1-13 relate to subject-matter considered by this Authority to be covered by the 
provisions of Rule 67.1(iv) PCT. Consequently, no opinion will be formulated with respect 
to the industrial applicability of the subject-matter of these claims (Article 34(4)(a)(i) PCT). 



Re Item IV. 



The separate inventions/groups of inventions are: 

1. Claims: 1 (partially), 2-9, 10 (partially), 11-21, 22 (partially) 

Compound of formula I wherein R40 is a moiety of formula II for inhibiting angiogenesis 
and treating a tumor. 

Compound of formula V for inhibiting angiogenesis and treating a tumor. 

2. Claims: 1,10,22 (all partially) 

Compound of formula I wherein R40 is a 4- or 5-membered, N-containing heterocyclic ring 
for inhibiting angiogenesis and treating a tumor 

3. Claims: 1 (partially), 10 (partially), 23 

Compound of formula IV for inhibiting angiogenesis and treating a tumor 

4. Claims: 1 (partially), 10 (partially), 24 

Compound of formula IX for inhibiting angiogenesis and treating a tumor 



They are not so linked as to form a single general inventive concept (Rule 13.1 PCT) for 
the following reasons: 
The problem to be solved by the present invention is to provide a medicament for inhibiting 
angiogenesis and treating tumors. 

The proposed solution is to use thalidomide analogs in particular: 

1. a) Compounds of formula (I) wherein R40 is a moiety of formula (II) and compounds 
of formula (V) 

b) Compounds of formula (I) wherein R40 is a 4- or 5-membered, N-containing 
heterocyclic ring 

2. Compounds of formula (IV) 

3. Compounds of formula (IX) 

Where claims define chemical alternatives, unity of invention should be considered to be 
present when the alternatives are of a similar nature [compare "Administrative Instructions 
under the PCT", Annex B, Unity of Invention, paragraph (f)]. 

Alternative chemical compounds are to be regarded as being of a similar nature where: 

(i) all alternatives have a common property or activity 

and 

(ii) a common structure is present, i.e. a significant structural element is shared by all of 
the alternatives, or in case a common structure is absent, all alternatives belong to a 
recognised class of chemical compounds in the art to which the invention pertains. 

The three alternatives 1, 2 and 3 claimed do not share a significant structural element (their 
structures are markedly different) and do not belong to a recognised group or a class of 
compounds, which may be expected to behave in the same way in the context of the 
claimed inventions. 

Furthermore, the fact that the different alternatives 1a, 1b, 2 and 3 can be considered as 
thalidomide analogs cannot represent the single general inventive concept linking the 
different compounds 1a, 1b, 2 and 3, since thalidomide and certain of its analogs (including 
compounds of formula (I) and (V)) have been already described as antiangiogenic agents 
in the state of the art. 

The document XP001202664 discloses the antiangiogenic and antitumor activities of the 
thalidomide analogs of formula (V), CPS13, CPS16 and CPS20. 

WO 03/014315 discloses the antiangiogenic and antitumor activities of 3-amino-thalidomide 
and 2,6-diamino-thalidomide (compounds of formula (I)). 

WO 02/064083 and XP002959459 disclose the antiangiogenic and antitumor activities of 
3-amino-thalidomide (compound of formula (I)). 

In the present application no further technical feature(s) can be distinguished that can be 
regarded as a "special technical feature" involved in the technical relationship among the 
different inventions. Consequently, the present application lacks unity of invention and the 
different solutions not belonging to a common inventive concept are identified as the 
different subjects listed below. Each of the inventions listed is a distinct invention, 
characterised by its own special technical feature, defining the contribution which each of 
the claimed inventions, considered as a whole, makes over the prior art, i.e. the specific 
features of the individual group of compounds. 

In reply to the objection to lack of unity, the applicant has paid all the additional search 

Consequently, this report has been established in respect of all parts of the international 
application. 



Re Item V. 

1 The following documents are referred to in this communication: 



D1: NG SYLVIA S. W. ET AL: "Antiangiogenic Activity of N-substituted and 
Tetrafluorinated Thalidomide Analogues" CANCER RESEARCH, 63(12), 3189-3194 
CODEN: CNREA8; ISSN: 0008-5472, 15 June 2003 (2003-06-15), XP001202664 
D2: WO 03/014315 A (ENTREMED INC; CHILDRENS MEDICAL CENTER (US)) 
20 February 2003 (2003-02-20) 
D3: WO 02/064083 A (ROUGAS JOHN; CONNER BARRY P (US); PRIBLUDA 
VICTOR (US); SHAH JAMSHED) 22 August 2002 (2002-08-22) 
D4: LENTZSCH S ET AL: "S-3-amino-phthalimido-glutarimide inhibits angiogenesis 
and growth of B-cell neoplasias in mice" CANCER RESEARCH, AMERICAN 
ASSOCIATION FOR CANCER RESEARCH, BALTIMORE, MD, US, vol. 62, 15 
April 2002 (2002-04-15), pages 2300-2305, XP002959459 ISSN: 0008-5472 
D5: HESS, S. ET AL: "Synthesis and immunological activity of water-soluble 
thalidomide prodrugs" BIOORGANIC & MEDICINAL CHEMISTRY, 9(5), 1279-1291 
CODEN: BMECEP; ISSN: 0968-0896, 2001, XP001202654 
D6: DE A U ET AL: "POSSIBLE ANTINEOPLASTIC AGENTS: PART IV - 
SYNTHESIS & ANTINEOPLASTIC POTENCY OF N-SUBSTITUTED ALPHA- 
(4,5-DIMETHOXYPHTHALIMIDO)GLUTARIMIDES & N-SUBSTITUTED BETA- 
(4-BROMOPHENYL)GLUTARIMIDES" INDIAN JOURNAL OF CHEMISTRY, 
SECTION B: ORGANIC, INCL. MEDICINAL, PUBLICATIONS & 
INFORMATIONS DIRECTORATE, NEW DELHI, IN, vol. 16B, no. 6, 1 June 
1978 (1978-06-01), pages 510-512, XP000675183 ISSN: 0019-5103 
D7: DE A U ET AL: "POSSIBLE ANTINEOPLASTIC AGENTS: III SYNTHESIS OF 
6-ALKYL-2- U4'METHOXYPHTHALIMIDO AND 6-ALKYL-3-U3'-4'- 
DIMETHOXYPHENYL GLUTARIMIDES" JOURNAL OF THE INDIAN 
CHEMICAL SOCIETY, THE INDIAN CHEMICAL SOCIETY, CALCUTTA, IN, 
vol. 53, no. 11, 1 November 1976 (1976-11-01), pages 1122-1125, 
XP000675187 ISSN: 0019-4522 
D8: WO 94/20085 A (CHILDRENS HOSP MEDICAL CENTER) 15 September 1994 
(1994-09-15) 

D9: GB-A-1 075 420 (CHEMIE GRUNENTHAL G.M.B.H) 12 July 1967 

D10: US-A-3 560 495 (ERNST FRANKUS ET AL) 2 February 1971 (1971-02-02) 

D11: WO 98/25895 A (ELI LILLY AND COMPANY; ANDERSON, BENJAMIN, A; 
BECKER, GERALD, W; CARTY) 18 June 1998 
D12: GB 962 857 A (THE DISTILLERS COMPANY LIMITED) 8 July 1964 
D13: US-A-3 314 953 (VAZAKAS ARISTOTLE J ET AL) 18 April 1967 
D14: US-A-5 789 434 (KLUENDER ET AL) 4 August 1998 (1998-08-04) 
D15: SUZUKI, MAMORU ET AL: "Use of a new protecting group in an attempted 
synthesis of cyclopropyldihydroxyphenylalanine" JOURNAL OF ORGANIC 
CHEMISTRY , 48(24), 4769-71 CODEN: JOCEAH; ISSN: 0022-3263, 1983, 
XP008044808 

D16: SEDLAK M ET AL: "Preparation, 1H and 13C NMR spectra of substituted 
2-benzoylaminocarboxamides" COLLECTION OF CZECHOSLOVAK 
CHEMICAL COMMUNICATIONS, ACADEMIC PRESS, LONDON, GB, vol. 
60, 1995, pages 150-160, XP002136842 ISSN: 0010-0765 

D17: US-A-4 092 147 (ASHKAR ET AL) 30 May 1978 (1978-05-30) 
D18: DE 33 32 633 A1 (LUITPOLD-WERK CHEMISCH PHARMAZEUTISCHE 
FABRIK GMBH & CO) 4 April 1985 (1985-04-04) 



D19: WO 97/12625 A (CYTOVEN J.V; GREEN, LAWRENCE, R; BLASECKI, 
JOHN, W) 10 April 1997 (1997-04-10) 
D20: US-A-4 291 048 (GOLD ET AL) 22 September 1981 (1981-09-22) 
D21: WO 99/13873 A (EVLANENKOVA, KLAVDIA STEPANOVNA) 25 March 
1999 (1999-03-25) 
D22: US-A-5 783 605 (KUO ET AL) 21 July 1998 (1998-07-21) 
D23: FOLKES L K ET AL: "Oxidative activation of indole-3-acetic acids to 
cytotoxic species- a potential new role for plant auxins in cancer therapy." 
BIOCHEMICAL PHARMACOLOGY. 15 JAN 2001, vol. 61, no. 2, 15 January 
2001 (2001-01-15), pages 129-136, XP008044869 ISSN: 0006-2952 
D24: KARBOWNIK M ET AL: "Indole-3-propionic acid, a melatonin-related 
molecule, protects hepatic microsomal membranes from iron-induced 
oxidative damage: relevance to cancer reduction." JOURNAL OF 
CELLULAR BIOCHEMISTRY. 2001, vol. 81, no. 3, 2001, pages 507-513, 
XP008044870 ISSN: 0730-2312 
D25: "THE MERCK INDEX" 2001, MERCK & CO., XP002322662 



2 NOVELTY (Article 33(2) PCT) 

2.1 The present application does not meet the criteria of Article 33(1) PCT, because the 
subject-matter of claims 1-3, 5, 8, 10-12, 14-16, 20 (as far as it relates to invention I) is not 
new in the sense of Article 33(2) PCT. 

The document D1 discloses the antiangiogenic and antitumor activities of the thalidomide 
analogs of formula (V): CPS13, CPS16 and CPS20. 

2.2 The present application does not meet the criteria of Article 33(1) PCT, because the 
subject-matter of claims 1,10,22 (as far as it relates to invention I) is not new in the sense 
of Article 33(2) PCT. 

The document D2 discloses the antiangiogenic and antitumor activities of 3-amino- 
thalidomide and 2,6-diamino-thalidomide (compounds of formula (I)). 
D3 and D4 disclose the antiangiogenic and antitumor activities of 3-amino-thalidomide 
(compound of formula (I)). D6 and D7 describe the antineoplastic activity of 
4,5-dimethoxy-thalidomide (D6, table 1) and 4-methoxy-thalidomide (D7, table 2). 

2.3 The present application does not meet the criteria of Article 33(1) PCT, because the 
subject-matter of claims 17,21 is not new in the sense of Article 33(2) PCT. 
Document D5 discloses the azide derivative of thalidomide of formula V (see page 1281, 
scheme 2, compound 16). 

2.4 The present application does not meet the criteria of Article 33(1) PCT, because the 
subject-matter of claims 1, 10, 22 (as far as it relates to invention II) is not new in the sense 
of Article 33(2) PCT. 

Documents D9 and D10 disclose compounds of formula (I) wherein R40 is a substituted 
5-membered, N-containing heterocyclic ring having marked anti-tumour actions (see the 
corresponding passages cited in the search report). 

2.5 The present application does not meet the criteria of Article 33(1) PCT, because the 
subject-matter of claims 1, 10, 22 (as far as it relates to invention II) is not new in the sense 
of Article 33(2) PCT. 

Document D11 discloses compounds of formula (I) wherein R40 is a substituted 
4-membered, N-containing heterocyclic ring for treating various cancers (see the 
corresponding passages cited in the search report). 

2.6 The present application does not meet the criteria of Article 33(1) PCT, because the 
subject-matter of claim 22 (as far as it relates to invention II) is not new in the sense of 
Article 33(2) PCT. 

Documents D12 and D13 disclose compounds of formula (I) wherein R40 is a substituted 
5-membered, N-containing heterocyclic ring, and more particularly the compound 3- 
phthalimido-pyrrolidine-2,5-dione (D12). 

2.7 The present application does not meet the criteria of Article 33(1) PCT, because the 
subject-matter of claims 1, 10 (as far as it relates to invention III) and 23 is not new in the 
sense of Article 33(2) PCT. 

Document D14 discloses compounds of formula IV, wherein R25 is methyl and R26 a 
substituted alkyl acid, for treating tumor metastasis (column 166, example 404; claims 
1, 8, 10). 

2.8 The present application does not meet the criteria of Article 33(1) PCT, because the 
subject-matter of claim 23 is not new in the sense of Article 33(2) PCT. 
Documents D15, D16, D17 and D18 disclose compounds of formula IV (see the 
corresponding passages cited in the search report). 

2.9 The present application does not meet the criteria of Article 33(1) PCT, because the 
subject-matter of claims 1, 10 (as far as it relates to invention IV) and 24 is not new in the 
sense of Article 33(2) PCT. 

The document D19 discloses the antiangiogenic and antitumor activities of a compound of 
formula IX, the dipeptide L-Glu-L-Trp (R33 is an amino acid moiety, i.e. any moiety having 
a carboxylic and an amino group). 

2.10 The present application does not meet the criteria of Article 33(1) PCT, because the 
subject-matter of claims 1, 10 (as far as it relates to invention IV) and 24 is not new in the 
sense of Article 33(2) PCT. 

Documents D20 and D21 disclose the use of a compound of formula IX (i.e. L-tryptophan) 
for treating tumors. 

2.11 The present application does not meet the criteria of Article 33(1) PCT, because the 
subject-matter of claims 1, 10 (as far as it relates to invention IV) and 24 is not new in the 
sense of Article 33(2) PCT. 
Documents D22, D23 and D24 disclose the compounds of formula IX, indole-3-acetic acid 
(D22,D23) and indole-3-propionic acid (D24), for use in cancer therapy. 

2.12 The present application does not meet the criteria of Article 33(1) PCT, because the 
subject-matter of claim 24 is not new in the sense of Article 33(2) PCT. 
Document D25 discloses compounds of formula IX: indole-3-acetic acid, indole-3-propionic 
acid and tryptophan. 

The attention of the Applicant is drawn to the fact that the mere explanation of an effect 
obtained when using a compound in a known process, even if the explanation relates to a 
pharmaceutical effect which was not known for that compound, cannot confer novelty to 
said process. In the present case, the newly discovered technical effect of inhibiting 
angiogenesis does not confer novelty on the claims directed to the use of a known 
thalidomide derivative for a known purpose (treatment of tumors or cancer). No novelty 
exists, if the claim is directed to the use of a known compound for a known purpose, even 
if a newly discovered technical effect (angiogenesis inhibition) underlying said known use 
is indicated in that claim. 



3 INVENTIVE STEP (Article 33(3) PCT) 

3.1 Should the Applicant have overcome the objections of lack of novelty raised above, an 
inventive step could not be acknowledged over D1 to D4 (invention I), 
D9, D10, D11 (invention II), D14 (invention III), D19 to D24 (invention IV) as the present 
subject-matter of claims 1-3, 5, 8, 10-12, 14-17, 20-24, as far as novel, appears to be an 
obvious alternative over said documents (Article 33(3) PCT). 

The antiangiogenic activity of compounds of formula I, V and IX has been clearly and 
unambiguously disclosed in D1, D2, D3, D4 and D19. 

The antitumor and anticancer activities of compounds of formulae I (wherein R40 is a 4- or 
5-membered, N-containing heterocyclic ring) and IV have been clearly and unambiguously 
disclosed in D9, D10, D11 and D14. 
3.2 The same applies to the subject-matter of the dependent claims 4, 6, 7, 9, 13, 18, 19 
which apparently does not contain any technical features which could be regarded as 
inventive per se. The skilled person would have expected for the compounds recited in 
said claims the same activity in the inhibition of angiogenesis as the closely related 
thalidomide analogs of D1 to D4. 



4 INDUSTRIAL APPLICABILITY (Article 33(4) PCT) 

4.1 There are no doubts about industrial applicability (Art.33(4) PCT) for the subject-matter 
of claims 14-24. 

4.2 However, for the assessment of the present claims 1-13 on the question whether they 
are industrially applicable, no unified criteria exist in the PCT Contracting States. The 
patentability can also be dependent upon the formulation of the claims. The EPO, for 
example, does not recognize as industrially applicable the subject-matter of claims to the 
use of a compound in medical treatment, but may allow, however, claims to a known com- 
pound for first use in medical treatment and the use of such a compound for the 
manufacture of a medicament for a new medical treatment. 